On 21 February 2018 at 21:11, Dan Ragle <[email protected]> wrote:
> On 2/3/2018 8:58 AM, Dan Ragle wrote:
>> On 2/2/2018 2:13 AM, Nithya Balachandran wrote:
>>> Hi Dan,
>>>
>>> It sounds like you might be running into [1]. The patch has been posted
>>> upstream and the fix should be in the next release.
>>> In the meantime, I'm afraid there is no way to get around this without
>>> restarting the process.
>>>
>>> Regards,
>>> Nithya
>>>
>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1541264
>>>
>> Much appreciated. Will watch for the next release and retest then.
>>
>> Cheers!
>>
>> Dan
>>
> FYI, this looks like it's fixed in 3.12.6. Ran the test setup with
> repeated ls listings for just shy of 48 hours with no increase in RAM
> usage. Next will try my production application load for awhile to see if
> it holds steady.
>
> The gf_dht_mt_dht_layout_t memusage num_allocs went quickly up to 105415
> and then stayed there for the entire 48 hours.
>

Excellent. Thanks for letting us know.

Nithya

> Thanks for the quick response,
>
> Dan
>
>>> On 2 February 2018 at 02:57, Dan Ragle <[email protected]> wrote:
>>>
>>> On 1/30/2018 6:31 AM, Raghavendra Gowdappa wrote:
>>>
>>> ----- Original Message -----
>>> From: "Dan Ragle" <[email protected]>
>>> To: "Raghavendra Gowdappa" <[email protected]>, "Ravishankar N" <[email protected]>
>>> Cc: [email protected], "Csaba Henk" <[email protected]>, "Niels de Vos" <[email protected]>, "Nithya Balachandran" <[email protected]>
>>> Sent: Monday, January 29, 2018 9:02:21 PM
>>> Subject: Re: [Gluster-users] Run away memory with gluster mount
>>>
>>> On 1/29/2018 2:36 AM, Raghavendra Gowdappa wrote:
>>>
>>> ----- Original Message -----
>>> From: "Ravishankar N" <[email protected]>
>>> To: "Dan Ragle" <[email protected]>, [email protected]
>>> Cc: "Csaba Henk" <[email protected]>, "Niels de Vos" <[email protected]>, "Nithya Balachandran" <[email protected]>, "Raghavendra Gowdappa" <[email protected]>
>>> Sent: Saturday, January 27, 2018 10:23:38 AM
>>> Subject: Re: [Gluster-users] Run away memory with gluster mount
>>>
>>> On 01/27/2018 02:29 AM, Dan Ragle wrote:
>>>
>>> On 1/25/2018 8:21 PM, Ravishankar N wrote:
>>>
>>> On 01/25/2018 11:04 PM, Dan Ragle wrote:
>>>
>>> *sigh* trying again to correct formatting ... apologize for the
>>> earlier mess.
>>>
>>> Having a memory issue with Gluster 3.12.4 and not sure how to
>>> troubleshoot. I don't *think* this is expected behavior.
>>>
>>> This is on an updated CentOS 7 box. The setup is a simple two node
>>> replicated layout where the two nodes act as both server and client.
>>>
>>> The volume in question:
>>>
>>> Volume Name: GlusterWWW
>>> Type: Replicate
>>> Volume ID: 8e9b0e79-f309-4d9b-a5bb-45d065faaaa3
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x 2 = 2
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: vs1dlan.mydomain.com:/glusterfs_bricks/brick1/www
>>> Brick2: vs2dlan.mydomain.com:/glusterfs_bricks/brick1/www
>>> Options Reconfigured:
>>> nfs.disable: on
>>> cluster.favorite-child-policy: mtime
>>> transport.address-family: inet
>>>
>>> I had some other performance options in there (increased cache-size, md
>>> invalidation, etc.) but stripped them out in an attempt to isolate the
>>> issue. Still got the problem without them.
>>>
>>> The volume currently contains over 1M files.
>>>
>>> When mounting the volume, I get (among other things) a process as such:
>>>
>>> /usr/sbin/glusterfs --volfile-server=localhost --volfile-id=/GlusterWWW /var/www
>>>
>>> This process begins with little memory, but then as files are accessed
>>> in the volume the memory increases. I set up a script that simply reads
>>> the files in the volume one at a time (no writes). It's been running on
>>> and off for about 12 hours now and the resident memory of the above
>>> process is already at 7.5G and continues to grow slowly. If I stop the
>>> test script the memory stops growing, but does not reduce. Restart the
>>> test script and the memory begins slowly growing again.
>>>
>>> This is obviously a contrived app environment. With my intended
>>> application load it takes about a week or so for the memory to get high
>>> enough to invoke the oom killer.
>>>
>>> Can you try debugging with the statedump
>>> (https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/#read-a-statedump)
>>> of the fuse mount process and see what member is leaking? Take the
>>> statedumps in succession, maybe once initially during the I/O and once
>>> the memory gets high enough to hit the OOM mark. Share the dumps here.
>>>
>>> Regards,
>>> Ravi
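
A minimal sketch of how such dumps can be taken, assuming the default dump
directory /var/run/gluster and a single fuse mount process for this volume
(the exact procedure used later in this thread may differ):

    # find the PID of the fuse mount process for the GlusterWWW volume
    pgrep -f 'volfile-id=/GlusterWWW'
    # ask it to write a statedump; a file named
    # glusterdump.<pid>.dump.<timestamp> should appear under /var/run/gluster
    kill -USR1 "$(pgrep -f 'volfile-id=/GlusterWWW')"
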
>>>
>>> Thanks for the reply. I noticed yesterday that an update (3.12.5) had
>>> been posted, so I went ahead and updated and repeated the test
>>> overnight. The memory usage does not appear to be growing as quickly as
>>> it was with 3.12.4, but does still appear to be growing.
>>>
>>> I should also mention that there is another process beyond my test app
>>> that is reading the files from the volume. Specifically, there is an
>>> rsync that runs from the second node 2-4 times an hour that reads from
>>> the GlusterWWW volume mounted on node 1. Since none of the files in that
>>> mount are changing it doesn't actually rsync anything, but nonetheless
>>> it is running and reading the files in addition to my test script. (It's
>>> a part of my intended production setup that I forgot was still running.)
>>>
>>> The mount process appears to be gaining memory at a rate of about 1GB
>>> every 4 hours or so. At that rate it'll take several days before it runs
>>> the box out of memory. But I took your suggestion and made some
>>> statedumps today anyway, about 2 hours apart, 4 total so far. It looks
>>> like there may already be some actionable information. These are the
>>> only registers where the num_allocs have grown with each of the four
>>> samples:
>>>
>>> [mount/fuse.fuse - usage-type gf_fuse_mt_gids_t memusage]
>>> ---> num_allocs at Fri Jan 26 08:57:31 2018: 784
>>> ---> num_allocs at Fri Jan 26 10:55:50 2018: 831
>>> ---> num_allocs at Fri Jan 26 12:55:15 2018: 877
>>> ---> num_allocs at Fri Jan 26 14:58:27 2018: 908
>>>
>>> [mount/fuse.fuse - usage-type gf_common_mt_fd_lk_ctx_t memusage]
>>> ---> num_allocs at Fri Jan 26 08:57:31 2018: 5
>>> ---> num_allocs at Fri Jan 26 10:55:50 2018: 10
>>> ---> num_allocs at Fri Jan 26 12:55:15 2018: 15
>>> ---> num_allocs at Fri Jan 26 14:58:27 2018: 17
>>>
>>> [cluster/distribute.GlusterWWW-dht - usage-type gf_dht_mt_dht_layout_t memusage]
>>> ---> num_allocs at Fri Jan 26 08:57:31 2018: 24243596
>>> ---> num_allocs at Fri Jan 26 10:55:50 2018: 27902622
>>> ---> num_allocs at Fri Jan 26 12:55:15 2018: 30678066
>>> ---> num_allocs at Fri Jan 26 14:58:27 2018: 33801036
>>>
>>> Not sure of the best way to get you the full dumps. They're pretty big,
>>> over 1G for all four. Also, I noticed some filepath information in there
>>> that I'd rather not share. What's the recommended next step?
>>>
>>> Please run the following queries on the statedump files and report the
>>> results:
>>>
>>> # grep itable <client-statedump> | grep active | wc -l
>>> # grep itable <client-statedump> | grep active_size
>>> # grep itable <client-statedump> | grep lru | wc -l
>>> # grep itable <client-statedump> | grep lru_size
>>> # grep itable <client-statedump> | grep purge | wc -l
>>> # grep itable <client-statedump> | grep purge_size
>>>
>>> Had to restart the test; it has been running for 36 hours now. RSS is
>>> currently up to 23g.
>>>
>>> Working on getting a bug report with a link to the dumps. In the
>>> meantime, I'm including the results of your queries above for the first
>>> dump, the 18 hour dump, and the 36 hour dump:
>>>
>>> # grep itable glusterdump.153904.dump.1517104561 | grep active | wc -l
>>> 53865
>>> # grep itable glusterdump.153904.dump.1517169361 | grep active | wc -l
>>> 53864
>>> # grep itable glusterdump.153904.dump.1517234161 | grep active | wc -l
>>> 53864
>>>
>>> # grep itable glusterdump.153904.dump.1517104561 | grep active_size
>>> xlator.mount.fuse.itable.active_size=53864
>>> # grep itable glusterdump.153904.dump.1517169361 | grep active_size
>>> xlator.mount.fuse.itable.active_size=53863
>>> # grep itable glusterdump.153904.dump.1517234161 | grep active_size
>>> xlator.mount.fuse.itable.active_size=53863
>>>
>>> # grep itable glusterdump.153904.dump.1517104561 | grep lru | wc -l
>>> 998510
>>> # grep itable glusterdump.153904.dump.1517169361 | grep lru | wc -l
>>> 998510
>>> # grep itable glusterdump.153904.dump.1517234161 | grep lru | wc -l
>>> 995992
>>>
>>> # grep itable glusterdump.153904.dump.1517104561 | grep lru_size
>>> xlator.mount.fuse.itable.lru_size=998508
>>> # grep itable glusterdump.153904.dump.1517169361 | grep lru_size
>>> xlator.mount.fuse.itable.lru_size=998508
>>> # grep itable glusterdump.153904.dump.1517234161 | grep lru_size
>>> xlator.mount.fuse.itable.lru_size=995990
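
A sketch of pulling that DHT layout counter out of each dump for a
side-by-side comparison, assuming the dumps keep their default
glusterdump.<pid>.dump.<timestamp> names, sit in the current directory,
and carry a num_allocs= line in each memusage section as shown above:

    for d in glusterdump.*.dump.*; do
        echo "== $d"
        # allocation counter of the gf_dht_mt_dht_layout_t memusage section
        grep -A5 'gf_dht_mt_dht_layout_t memusage' "$d" | grep '^num_allocs='
    done
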
>>>
>>> Around 1 million inodes in the lru table!! These are inodes the kernel
>>> has cached but on which no operation is currently in progress. This
>>> could be the reason for the high memory usage.
>>>
>>> We have a patch being worked on (currently merged on the experimental
>>> branch) [1] that will help in these scenarios. In the meantime, can you
>>> remount glusterfs with the options --entry-timeout=0 and
>>> --attribute-timeout=0? This will make sure that the kernel won't cache
>>> inodes/attributes of the files and should bring down the memory usage.
>>>
>>> I am curious to know what your data set is like. Is it a case of too
>>> many directories and files present in deep directories? I am wondering
>>> whether a significant number of the inodes cached by the kernel are
>>> there to hold dentry structures in the kernel.
>>>
>>> [1] https://review.gluster.org/#/c/18665/
>>>
>>> OK, remounted with your recommended attributes and repeated the test.
>>> Now the mount process looks like this:
>>>
>>> /usr/sbin/glusterfs --attribute-timeout=0 --entry-timeout=0 --volfile-server=localhost --volfile-id=/GlusterWWW /var/www
>>>
>>> However, after running for 36 hours it's again at about 23g (about the
>>> same place it was on the first test).
>>>
>>> A few metrics from the 36 hour mark:
>>>
>>> num_allocs for [cluster/distribute.GlusterWWW-dht - usage-type
>>> gf_dht_mt_dht_layout_t memusage] is 109140094. Seems at least somewhat
>>> similar to the original test, which had 117901593 at the 36 hour mark.
>>>
>>> The dump file at the 36 hour mark had nothing for lru or lru_size.
>>> However, the dump two hours prior had:
>>>
>>> # grep itable glusterdump.67299.dump.1517493361 | grep lru | wc -l
>>> 998510
>>> # grep itable glusterdump.67299.dump.1517493361 | grep lru_size
>>> xlator.mount.fuse.itable.lru_size=998508
>>>
>>> and the same thing for the dump four hours later. Are these values only
>>> relevant when the ls -R is actually running? I'm thinking the 36 hour
>>> dump may have caught the ls -R between runs there (?)
>>>
>>> The data set is multiple Web sites. I know there's some litter there we
>>> can clean up, but I'd guess not more than 200-300k files or so. The
>>> biggest culprit is a single directory that we use as a multi-purpose
>>> file store, with filenames stored as GUIDs and linked to a DB. That
>>> directory currently has 500k+ files. Another directory serves a similar
>>> purpose and has about 66k files in it. The rest is generally distributed
>>> more "normally", i.e., a mixed nesting of directories and files.
>>>
>>> Cheers!
>>>
>>> Dan
>>>
>>> # grep itable glusterdump.153904.dump.1517104561 | grep purge | wc -l
>>> 1
>>> # grep itable glusterdump.153904.dump.1517169361 | grep purge | wc -l
>>> 1
>>> # grep itable glusterdump.153904.dump.1517234161 | grep purge | wc -l
>>> 1
>>>
>>> # grep itable glusterdump.153904.dump.1517104561 | grep purge_size
>>> xlator.mount.fuse.itable.purge_size=0
>>> # grep itable glusterdump.153904.dump.1517169361 | grep purge_size
>>> xlator.mount.fuse.itable.purge_size=0
>>> # grep itable glusterdump.153904.dump.1517234161 | grep purge_size
>>> xlator.mount.fuse.itable.purge_size=0
>>>
>>> Cheers,
>>>
>>> Dan
>>>
>>> I've CC'd the fuse/dht devs to see if these data types have potential
>>> leaks. Could you raise a bug with the volume info and a (dropbox?) link
>>> from which we can download the dumps? You can remove/replace the
>>> filepaths from them.
>>>
>>> Regards,
>>> Ravi
>>>
>>> Cheers!
>>>
>>> Dan
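
For reference, the remount suggested above can also be expressed through
the native mount helper rather than the raw glusterfs command line; a
sketch assuming the volume is mounted from localhost at /var/www and that
mount.glusterfs accepts the matching attribute-timeout/entry-timeout
options:

    umount /var/www
    mount -t glusterfs -o attribute-timeout=0,entry-timeout=0 localhost:/GlusterWWW /var/www
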
>>>
>>> Is there potentially something misconfigured here?
>>>
>>> I did see a reference to a memory leak in another thread in this list,
>>> but that had to do with the setting of quotas; I don't have any quotas
>>> set on my system.
>>>
>>> Thanks,
>>>
>>> Dan Ragle
>>> [email protected]
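
The kind of read-only walk and resident-memory check described earlier in
the thread could look roughly like the following; this is only a sketch
(the pgrep pattern, the assumption of a single matching mount process, and
the 5-minute interval are illustrative, not Dan's actual script):

    # one pass over the volume, reading every file, while logging the
    # fuse mount's RSS until the walk finishes
    MNTPID=$(pgrep -f 'volfile-id=/GlusterWWW')
    find /var/www -type f -exec cat {} + > /dev/null &
    WALK=$!
    while kill -0 "$WALK" 2>/dev/null; do
        echo "$(date) RSS(kB): $(ps -o rss= -p "$MNTPID")"
        sleep 300
    done
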
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
