Re: [Gluster-users] Run away memory with gluster mount
Hi Dan,

It sounds like you might be running into [1]. The patch has been posted upstream and the fix should be in the next release. In the meantime, I'm afraid there is no way to get around this without restarting the process.

Regards,
Nithya

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1541264

On 2 February 2018 at 02:57, Dan Ragle wrote:
>
> On 1/30/2018 6:31 AM, Raghavendra Gowdappa wrote:
>>
>> ----- Original Message -----
>>> From: "Dan Ragle"
>>> To: "Raghavendra Gowdappa" , "Ravishankar N" <ravishan...@redhat.com>
>>> Cc: gluster-users@gluster.org, "Csaba Henk" , "Niels de Vos" , "Nithya Balachandran"
>>> Sent: Monday, January 29, 2018 9:02:21 PM
>>> Subject: Re: [Gluster-users] Run away memory with gluster mount
>>>
>>> On 1/29/2018 2:36 AM, Raghavendra Gowdappa wrote:
>>>>
>>>> ----- Original Message -----
>>>>> From: "Ravishankar N"
>>>>> To: "Dan Ragle" , gluster-users@gluster.org
>>>>> Cc: "Csaba Henk" , "Niels de Vos" , "Nithya Balachandran" , "Raghavendra Gowdappa"
>>>>> Sent: Saturday, January 27, 2018 10:23:38 AM
>>>>> Subject: Re: [Gluster-users] Run away memory with gluster mount
>>>>>
>>>>> On 01/27/2018 02:29 AM, Dan Ragle wrote:
>>>>>>
>>>>>> On 1/25/2018 8:21 PM, Ravishankar N wrote:
>>>>>>>
>>>>>>> On 01/25/2018 11:04 PM, Dan Ragle wrote:
>>>>>>>>
>>>>>>>> *sigh* trying again to correct formatting ... apologize for the earlier mess.
>>>>>>>>
>>>>>>>> Having a memory issue with Gluster 3.12.4 and not sure how to troubleshoot. I don't *think* this is expected behavior. This is on an updated CentOS 7 box. The setup is a simple two-node replicated layout where the two nodes act as both server and client.
>>>>>>>>
>>>>>>>> The volume in question:
>>>>>>>>
>>>>>>>> Volume Name: GlusterWWW
>>>>>>>> Type: Replicate
>>>>>>>> Volume ID: 8e9b0e79-f309-4d9b-a5bb-45d065f3
>>>>>>>> Status: Started
>>>>>>>> Snapshot Count: 0
>>>>>>>> Number of Bricks: 1 x 2 = 2
>>>>>>>> Transport-type: tcp
>>>>>>>> Bricks:
>>>>>>>> Brick1: vs1dlan.mydomain.com:/glusterfs_bricks/brick1/www
>>>>>>>> Brick2: vs2dlan.mydomain.com:/glusterfs_bricks/brick1/www
>>>>>>>> Options Reconfigured:
>>>>>>>> nfs.disable: on
>>>>>>>> cluster.favorite-child-policy: mtime
>>>>>>>> transport.address-family: inet
>>>>>>>>
>>>>>>>> I had some other performance options in there (increased cache-size, md invalidation, etc.) but stripped them out in an attempt to isolate the issue. Still got the problem without them.
>>>>>>>>
>>>>>>>> The volume currently contains over 1M files.
>>>>>>>>
>>>>>>>> When mounting the volume, I get (among other things) a process as such:
>>>>>>>>
>>>>>>>> /usr/sbin/glusterfs --volfile-server=localhost --volfile-id=/GlusterWWW /var/www
>>>>>>>>
>>>>>>>> This process begins with little memory, but then as files are accessed in the volume the memory increases. I set up a script that simply reads the files in the volume one at a time (no writes). It's been running on and off for about 12 hours now and the resident memory of the above process is already at 7.5G and continues to grow slowly. If I stop the test script the memory stops growing, but does not reduce. Restart the test script and the memory begins slowly growing again.
>>>>>>>>
>>>>>>>> This is obviously a contrived app environment. With my intended application load it takes about a week or so for the memory to get high enough to invoke the oom killer.
>>>>>>>
>>>>>>> Can you try debugging with the statedump
>>>>>>> (https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/#read-a-statedump)
>>>>>>> of the fuse mount process and see what member is leaking? Take the statedumps in succession, maybe once initially during the I/O and once the memory gets high enough to hit the OOM mark. Share the dumps here.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Ravi
>>>>>>
>>>>>> Thanks for the reply. I noticed yesterday that an update (3.12.5) had been posted, so I went ahead and updated and repeated the test overnight. The memory usage does not appear to be growing as quickly as it was with 3.12.4, but does still appear to be growing.
>>>>>>
>>>>>> I should also mention that there is another process beyond my test app that is reading the files from the volume. Specifically, there is an rsync that runs from the second node 2-4 times an hour that reads from the GlusterWWW volume mounted on node 1. Since none of the files in that mount
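For anyone wanting to follow Ravi's statedump suggestion on their own setup, capturing a dump of the fuse client is a one-liner (a minimal sketch, assuming the default dump directory /var/run/gluster; the pgrep pattern below is illustrative for this thread's mount):

    # PID of the fuse mount process for this volume
    PID=$(pgrep -f 'glusterfs.*volfile-id=/GlusterWWW')

    # SIGUSR1 asks a gluster process to write a statedump
    # without otherwise disturbing it
    kill -USR1 "$PID"

    # dumps appear as glusterdump.<pid>.dump.<timestamp>
    ls -lt /var/run/gluster/ | head

Taking one dump early in the I/O run and another near the OOM point, as Ravi suggests, makes the growing allocator stand out when the two are compared.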
[Gluster-users] Release 3.12.6: Scheduled for the 12th of February
Hi,

It's time to prepare the 3.12.6 release, which falls on the 10th of each month, and hence would be 12-02-2018 this time around (the 10th being a Saturday). This mail is to call out the following:

1) Are there any pending *blocker* bugs that need to be tracked for 3.12.6? If so, mark them against the provided tracker [1] as blockers for the release, or at the very least post them as a response to this mail.

2) Pending reviews in the 3.12 dashboard will be part of the release *iff* they pass regressions and have the review votes, so use the dashboard [2] to check on the status of your patches to 3.12 and get these going.

3) I have checked what went into 3.10 since the 3.12 release and whether those fixes are already included in the 3.12 branch. Status on this is *green*: all fixes ported to 3.10 have been ported to 3.12 as well.

Thanks,
Jiffin

[1] Release bug tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.12.6
[2] 3.12 review dashboard: https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:3-12-dashboard
Re: [Gluster-users] Tiered volume performance degrades badly after a volume stop/start or system restart.
The problem was simple: the sqlite3 DB connection parameters were only being set on a newly created DB, not when there was an existing DB. Apparently the sqlite3 default parameters are not ideal. The patch is in the bug: 1540376

On Thu, Feb 1, 2018 at 9:32 AM, Jeff Byers wrote:
> This problem appears to be related to the sqlite3 DB files
> that are used for the tiering file access counters, stored on
> each hot and cold tier brick in .glusterfs/.db.
>
> When the tier is first created, these DB files do not exist,
> they are created, and everything works fine.
>
> On a stop/start or service restart, the .db files are already
> present, albeit empty since I don't have cluster.write-freq-threshold
> nor cluster.read-freq-threshold set, so features.record-counters is
> off and nothing should be going into the DB.
>
> I've found that if I delete these .db files after the volume
> stop, but before the volume start, the tiering performance is
> normal, not degraded. Of course all of the history in these DB
> files is lost. Not sure what other ramifications there are to
> deleting these .db files.
>
> When I did have one of the freq-threshold settings set, I did
> see a record get added to the file, so the sqlite3 DB is
> working to some degree.
>
> The sqlite3 version I have installed is sqlite-3.6.20-1.el6_7.2.x86_64.
>
> On Tue, Jan 30, 2018 at 10:17 PM, Vlad Kopylov wrote:
>> Tested it in two different environments lately with exactly the same results.
>> Was trying to get better read performance from local mounts with
>> hundreds of thousands of maildir email files by using SSD, hoping
>> that .gluster file stat reads would improve for files that migrate
>> to the hot tier.
>> After seeing what you described for 24 hours and confirming all the
>> moving around between the tiers was done - killed it.
>> Here are my volume settings - maybe they will be useful to spot conflicting ones.
>>
>> cluster.shd-max-threads: 12
>> performance.rda-cache-limit: 128MB
>> cluster.readdir-optimize: on
>> cluster.read-hash-mode: 0
>> performance.strict-o-direct: on
>> cluster.lookup-unhashed: auto
>> performance.nl-cache: on
>> performance.nl-cache-timeout: 600
>> cluster.lookup-optimize: on
>> client.event-threads: 8
>> performance.client-io-threads: on
>> performance.md-cache-timeout: 600
>> server.event-threads: 8
>> features.cache-invalidation: on
>> features.cache-invalidation-timeout: 600
>> performance.stat-prefetch: on
>> performance.cache-invalidation: on
>> network.inode-lru-limit: 9
>> performance.cache-refresh-timeout: 10
>> performance.enable-least-priority: off
>> performance.cache-size: 2GB
>> cluster.nufa: on
>> cluster.choose-local: on
>> server.outstanding-rpc-limit: 128
>>
>> fuse mounting with:
>> defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5
>>
>> On Tue, Jan 30, 2018 at 6:29 PM, Jeff Byers wrote:
>>> I am fighting this issue:
>>>
>>> Bug 1540376 – Tiered volume performance degrades badly after a
>>> volume stop/start or system restart.
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1540376
>>>
>>> Does anyone have any ideas on what might be causing this, and
>>> what a fix or work-around might be?
>>>
>>> Thanks!
>>>
>>> ~ Jeff Byers ~
>>>
>>> Tiered volume performance degrades badly after a volume
>>> stop/start or system restart.
>>>
>>> The degradation is very significant, making the performance of
>>> an SSD hot tiered volume a fraction of what it was with the
>>> HDD before tiering.
>>>
>>> Stopping and starting the tiered volume causes the problem to
>>> exhibit. Stopping and starting the Gluster services also does.
>>>
>>> Nothing in the tier is being promoted or demoted; the volume
>>> starts empty, a file is written, then read, then deleted. The
>>> file(s) only ever exist on the hot tier.
>>>
>>> This affects GlusterFS FUSE mounts, and also NFSv3 NFS mounts.
>>> The problem has been reproduced in two test lab environments.
>>> The issue was first seen using GlusterFS 3.7.18, and retested
>>> with the same result using GlusterFS 3.12.3.
>>>
>>> I'm using the default tiering settings, no adjustments.
>>>
>>> Nothing of any significance appears to be reported in
>>> the GlusterFS logs.
>>>
>>> Summary:
>>>
>>> Before SSD tiering, HDD performance on a FUSE mount was 130.87
>>> MB/sec writes, 128.53 MB/sec reads.
>>>
>>> After SSD tiering, performance on a FUSE mount was 199.99
>>> MB/sec writes, 257.28 MB/sec reads.
>>>
>>> After a GlusterFS volume stop/start, SSD tiering performance on
>>> the FUSE mount was 35.81 MB/sec writes, 37.33 MB/sec reads. A very
>>> significant reduction in performance.
>>>
>>> Detaching and reattaching the SSD tier restores the good
>>> tiered performance.
>>>
>>> ~ Jeff Byers ~
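In script form, the workaround Jeff describes amounts to the following (a rough sketch; the volume name and brick paths are placeholders, and note that removing the DBs discards all tiering access history):

    # stop the tiered volume first
    gluster volume stop tiervol

    # remove the stale counter DBs (plus any sqlite -wal/-shm side
    # files) on every hot and cold tier brick; all history is lost
    rm -f /bricks/hot1/.glusterfs/*.db*
    rm -f /bricks/cold1/.glusterfs/*.db*

    gluster volume start tiervol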
Re: [Gluster-users] Run away memory with gluster mount
On 1/30/2018 6:31 AM, Raghavendra Gowdappa wrote:

----- Original Message -----
From: "Dan Ragle"
To: "Raghavendra Gowdappa" , "Ravishankar N"
Cc: gluster-users@gluster.org, "Csaba Henk" , "Niels de Vos" , "Nithya Balachandran"
Sent: Monday, January 29, 2018 9:02:21 PM
Subject: Re: [Gluster-users] Run away memory with gluster mount

On 1/29/2018 2:36 AM, Raghavendra Gowdappa wrote:

----- Original Message -----
From: "Ravishankar N"
To: "Dan Ragle" , gluster-users@gluster.org
Cc: "Csaba Henk" , "Niels de Vos" , "Nithya Balachandran" , "Raghavendra Gowdappa"
Sent: Saturday, January 27, 2018 10:23:38 AM
Subject: Re: [Gluster-users] Run away memory with gluster mount

On 01/27/2018 02:29 AM, Dan Ragle wrote:

On 1/25/2018 8:21 PM, Ravishankar N wrote:

On 01/25/2018 11:04 PM, Dan Ragle wrote:

*sigh* trying again to correct formatting ... apologize for the earlier mess.

Having a memory issue with Gluster 3.12.4 and not sure how to troubleshoot. I don't *think* this is expected behavior. This is on an updated CentOS 7 box. The setup is a simple two-node replicated layout where the two nodes act as both server and client.

The volume in question:

Volume Name: GlusterWWW
Type: Replicate
Volume ID: 8e9b0e79-f309-4d9b-a5bb-45d065f3
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: vs1dlan.mydomain.com:/glusterfs_bricks/brick1/www
Brick2: vs2dlan.mydomain.com:/glusterfs_bricks/brick1/www
Options Reconfigured:
nfs.disable: on
cluster.favorite-child-policy: mtime
transport.address-family: inet

I had some other performance options in there (increased cache-size, md invalidation, etc.) but stripped them out in an attempt to isolate the issue. Still got the problem without them.

The volume currently contains over 1M files.

When mounting the volume, I get (among other things) a process as such:

/usr/sbin/glusterfs --volfile-server=localhost --volfile-id=/GlusterWWW /var/www

This process begins with little memory, but then as files are accessed in the volume the memory increases. I set up a script that simply reads the files in the volume one at a time (no writes). It's been running on and off for about 12 hours now and the resident memory of the above process is already at 7.5G and continues to grow slowly. If I stop the test script the memory stops growing, but does not reduce. Restart the test script and the memory begins slowly growing again.

This is obviously a contrived app environment. With my intended application load it takes about a week or so for the memory to get high enough to invoke the oom killer.

Can you try debugging with the statedump
(https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/#read-a-statedump)
of the fuse mount process and see what member is leaking? Take the statedumps in succession, maybe once initially during the I/O and once the memory gets high enough to hit the OOM mark. Share the dumps here.

Regards,
Ravi

Thanks for the reply. I noticed yesterday that an update (3.12.5) had been posted, so I went ahead and updated and repeated the test overnight. The memory usage does not appear to be growing as quickly as it was with 3.12.4, but does still appear to be growing.

I should also mention that there is another process beyond my test app that is reading the files from the volume. Specifically, there is an rsync that runs from the second node 2-4 times an hour that reads from the GlusterWWW volume mounted on node 1. Since none of the files in that mount are changing it doesn't actually rsync anything, but nonetheless it is running and reading the files in addition to my test script. (It's a part of my intended production setup that I forgot was still running.)

The mount process appears to be gaining memory at a rate of about 1GB every 4 hours or so. At that rate it'll take several days before it runs the box out of memory. But I took your suggestion and made some statedumps today anyway, about 2 hours apart, 4 total so far. It looks like there may already be some actionable information. These are the only registers where the num_allocs have grown with each of the four samples:

[mount/fuse.fuse - usage-type gf_fuse_mt_gids_t memusage]
---> num_allocs at Fri Jan 26 08:57:31 2018: 784
---> num_allocs at Fri Jan 26 10:55:50 2018: 831
---> num_allocs at Fri Jan 26 12:55:15 2018: 877
---> num_allocs at Fri Jan 26 14:58:27 2018: 908

[mount/fuse.fuse - usage-type gf_common_mt_fd_lk_ctx_t memusage]
---> num_allocs at Fri Jan 26 08:57:31 2018: 5
---> num_allocs at Fri Jan 26 10:55:50 2018: 10
---> num_allocs at Fri Jan 26 12:55:15 2018: 15
---> num_allocs at Fri Jan 26 14:58:27 2018: 17

[cluster/distribute.GlusterWWW-dht -
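A quick way to track a suspect allocator like this across successive dumps is to grep each dump for its memusage section (a sketch; the dump filenames under the default /var/run/gluster vary by PID and timestamp):

    for d in /var/run/gluster/glusterdump.*.dump.*; do
        echo "== $d"
        # size and num_allocs are listed right below each section header
        grep -A2 'usage-type gf_fuse_mt_gids_t' "$d"
    done

Running this after each new dump shows at a glance whether num_allocs keeps climbing for the translator in question.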
[Gluster-users] Gluster Monthly Newsletter, January 2018
Gluster Monthly Newsletter, January 2018

4.0 is coming! We're currently tracking for a 4.0 release at the end of February, which means our next edition will be all about 4.0!

This weekend, we have a busy schedule at FOSDEM with a Software Defined Storage DevRoom on Sunday - https://fosdem.org/2018/schedule/track/software_defined_storage/ - with:

Gluster-4.0 and GD2 - Learn what's in Gluster's future
https://fosdem.org/2018/schedule/event/gluster4_and_gd2/

Performance: Optimizing Software Defined Storage for the Age of Flash
https://fosdem.org/2018/schedule/event/optimizing_sds/

We also have a Gluster stand in the main halls, come find us!

Event planning for next year: As part of the Community Working Group issue queue, we're asking for recommendations of events that Gluster should be focusing on. https://github.com/gluster/community/issues/7 has more details; we're planning from March 2018 through March 2019 and we'd like your input! Where should Gluster be at?

New event announcement! Announcing Glustered 2018 in Bologna (IT):
http://lists.gluster.org/pipermail/gluster-users/2017-December/033117.html

Top Contributing Companies: Red Hat, Gluster, Inc., Facebook

Noteworthy threads:

[Gluster-users] 2018 - Plans and Expectations on Gluster Community
http://lists.gluster.org/pipermail/gluster-users/2018-January/033144.html

[Gluster-users] Integration of GPU with glusterfs
http://lists.gluster.org/pipermail/gluster-users/2018-January/033206.html

[Gluster-users] IMP: Release 4.0: CentOS 6 packages will not be made available
http://lists.gluster.org/pipermail/gluster-users/2018-January/033212.html

[Gluster-users] Community NetBSD regression tests EOL'd
http://lists.gluster.org/pipermail/gluster-users/2018-January/033214.html

[Gluster-devel] Release 4.0: Making it happen!
http://lists.gluster.org/pipermail/gluster-devel/2018-January/054164.html

[Gluster-devel] GD 2 xlator option changes
http://lists.gluster.org/pipermail/gluster-devel/2018-January/054214.html

[Gluster-devel] Regression tests time
http://lists.gluster.org/pipermail/gluster-devel/2018-January/054289.html

Upcoming CFPs:

Glustered with Incontro DevOps - CfP now open! - event is March 8, 2018
www.incontrodevops.it/events/glustered-2018/

LinuxCon China - March 4, 2018
https://www.lfasiallc.com/linuxcon-containercon-cloudopen-china/cfp

--
Amye Scavarda | a...@redhat.com | Gluster Community Lead
Re: [Gluster-users] Tiered volume performance degrades badly after a volume stop/start or system restart.
This problem appears to be related to the sqlite3 DB files that are used for the tiering file access counters, stored on each hot and cold tier brick in .glusterfs/.db.

When the tier is first created, these DB files do not exist, they are created, and everything works fine.

On a stop/start or service restart, the .db files are already present, albeit empty since I don't have cluster.write-freq-threshold nor cluster.read-freq-threshold set, so features.record-counters is off and nothing should be going into the DB.

I've found that if I delete these .db files after the volume stop, but before the volume start, the tiering performance is normal, not degraded. Of course all of the history in these DB files is lost. Not sure what other ramifications there are to deleting these .db files.

When I did have one of the freq-threshold settings set, I did see a record get added to the file, so the sqlite3 DB is working to some degree.

The sqlite3 version I have installed is sqlite-3.6.20-1.el6_7.2.x86_64.

On Tue, Jan 30, 2018 at 10:17 PM, Vlad Kopylov wrote:
> Tested it in two different environments lately with exactly the same results.
> Was trying to get better read performance from local mounts with hundreds
> of thousands of maildir email files by using SSD, hoping that .gluster file
> stat reads would improve for files that migrate to the hot tier.
> After seeing what you described for 24 hours and confirming all the moving
> around between the tiers was done - killed it.
> Here are my volume settings - maybe they will be useful to spot conflicting ones.
>
> cluster.shd-max-threads: 12
> performance.rda-cache-limit: 128MB
> cluster.readdir-optimize: on
> cluster.read-hash-mode: 0
> performance.strict-o-direct: on
> cluster.lookup-unhashed: auto
> performance.nl-cache: on
> performance.nl-cache-timeout: 600
> cluster.lookup-optimize: on
> client.event-threads: 8
> performance.client-io-threads: on
> performance.md-cache-timeout: 600
> server.event-threads: 8
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.stat-prefetch: on
> performance.cache-invalidation: on
> network.inode-lru-limit: 9
> performance.cache-refresh-timeout: 10
> performance.enable-least-priority: off
> performance.cache-size: 2GB
> cluster.nufa: on
> cluster.choose-local: on
> server.outstanding-rpc-limit: 128
>
> fuse mounting with:
> defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5
>
> On Tue, Jan 30, 2018 at 6:29 PM, Jeff Byers wrote:
>> I am fighting this issue:
>>
>> Bug 1540376 – Tiered volume performance degrades badly after a
>> volume stop/start or system restart.
>> https://bugzilla.redhat.com/show_bug.cgi?id=1540376
>>
>> Does anyone have any ideas on what might be causing this, and
>> what a fix or work-around might be?
>>
>> Thanks!
>>
>> ~ Jeff Byers ~
>>
>> Tiered volume performance degrades badly after a volume
>> stop/start or system restart.
>>
>> The degradation is very significant, making the performance of
>> an SSD hot tiered volume a fraction of what it was with the
>> HDD before tiering.
>>
>> Stopping and starting the tiered volume causes the problem to
>> exhibit. Stopping and starting the Gluster services also does.
>>
>> Nothing in the tier is being promoted or demoted; the volume
>> starts empty, a file is written, then read, then deleted. The
>> file(s) only ever exist on the hot tier.
>>
>> This affects GlusterFS FUSE mounts, and also NFSv3 NFS mounts.
>> The problem has been reproduced in two test lab environments.
>> The issue was first seen using GlusterFS 3.7.18, and retested
>> with the same result using GlusterFS 3.12.3.
>>
>> I'm using the default tiering settings, no adjustments.
>>
>> Nothing of any significance appears to be reported in
>> the GlusterFS logs.
>>
>> Summary:
>>
>> Before SSD tiering, HDD performance on a FUSE mount was 130.87
>> MB/sec writes, 128.53 MB/sec reads.
>>
>> After SSD tiering, performance on a FUSE mount was 199.99
>> MB/sec writes, 257.28 MB/sec reads.
>>
>> After a GlusterFS volume stop/start, SSD tiering performance on
>> the FUSE mount was 35.81 MB/sec writes, 37.33 MB/sec reads. A very
>> significant reduction in performance.
>>
>> Detaching and reattaching the SSD tier restores the good
>> tiered performance.
>>
>> ~ Jeff Byers ~

--
~ Jeff Byers ~
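Since the counter DBs are ordinary sqlite3 files, part of their configuration can be inspected directly on a brick (a sketch; the exact .db filename under .glusterfs/ varies, and /path/to/brick is a placeholder). Only a few settings such as journal_mode and page_size are persisted in the file itself; most connection parameters must be re-applied on every open, which is consistent with the fix described earlier in this thread:

    # locate the tiering counter DB on a brick (name varies by volume)
    ls /path/to/brick/.glusterfs/*.db

    # pragmas that are stored in the DB file itself
    sqlite3 /path/to/brick/.glusterfs/NAME.db \
        'PRAGMA journal_mode; PRAGMA page_size;'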
Re: [Gluster-users] How to trigger a resync of a newly replaced empty brick in replicate config ?
Hi,

Thanks. However, "gluster v heal volname full" returned the following error message:

Commit failed on server4. Please check log file for details.

I have checked the log files in /var/log/glusterfs on server4 (by grepping for "heal"), but did not get any match. What should I be looking for, and in which log file, please? Note that there is currently a rebalance process running on the volume.

Many thanks,

A.

On Thursday, 1 February 2018 17:32:19 CET Serkan Çoban wrote:
> You do not need to reset-brick if the brick path does not change. Replace
> the brick: format and mount it, then run "gluster v start volname force".
> To start self-heal, just run "gluster v heal volname full".
>
> On Thu, Feb 1, 2018 at 6:39 PM, Alessandro Ipe wrote:
>> Hi,
>>
>> My volume home is configured in replicate mode (version 3.12.4) with the bricks
>> server1:/data/gluster/brick1
>> server2:/data/gluster/brick1
>>
>> server2:/data/gluster/brick1 was corrupted, so I killed the gluster daemon for
>> that brick on server2, umounted it, reformatted it, remounted it and did a
>>
>>> gluster volume reset-brick home server2:/data/gluster/brick1
>>> server2:/data/gluster/brick1 commit force
>>
>> I was expecting that the self-heal daemon would start copying data from
>> server1:/data/gluster/brick1 (about 7.4 TB) to the empty
>> server2:/data/gluster/brick1, which it only did for directories, but not
>> for files.
>>
>> For the moment, I launched on the fuse mount point
>>
>>> find . | xargs stat
>>
>> but crawling the whole volume (100 TB) to trigger self-healing of a single
>> brick of 7.4 TB is inefficient.
>>
>> Is there any trick to only self-heal a single brick, for example by setting
>> some attributes on its top directory?
>>
>> Many thanks,
>>
>> Alessandro

--
Dr. Ir. Alessandro Ipe
Department of Observations, Remote Sensing from Space
Royal Meteorological Institute
Avenue Circulaire 3, B-1180 Brussels, Belgium
Tel. +32 2 373 06 31
Email: alessandro@meteo.be
Web: http://gerb.oma.be
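When "heal ... full" reports "Commit failed on <node>", the rejection is usually recorded by glusterd on that node rather than by the self-heal daemon, so grepping only for "heal" can miss it. A couple of places worth checking (a sketch assuming default 3.x log locations; the running rebalance may also matter, since glusterd serializes volume operations and can refuse a heal command while another transaction holds the cluster lock):

    # on server4: glusterd handles the commit phase of the heal command
    grep -iE 'heal|commit|lock' \
        /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | tail -n 50

    # confirm the self-heal daemon is running on all nodes
    gluster volume status home shd

    # and check the pending-heal backlog
    gluster volume heal home info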
Re: [Gluster-users] How to trigger a resync of a newly replaced empty brick in replicate config ?
You do not need to reset-brick if the brick path does not change. Replace the brick: format and mount it, then run "gluster v start volname force". To start self-heal, just run "gluster v heal volname full".

On Thu, Feb 1, 2018 at 6:39 PM, Alessandro Ipe wrote:
> Hi,
>
> My volume home is configured in replicate mode (version 3.12.4) with the bricks
> server1:/data/gluster/brick1
> server2:/data/gluster/brick1
>
> server2:/data/gluster/brick1 was corrupted, so I killed the gluster daemon for
> that brick on server2, umounted it, reformatted it, remounted it and did a
>
>> gluster volume reset-brick home server2:/data/gluster/brick1
>> server2:/data/gluster/brick1 commit force
>
> I was expecting that the self-heal daemon would start copying data from
> server1:/data/gluster/brick1 (about 7.4 TB) to the empty
> server2:/data/gluster/brick1, which it only did for directories, but not
> for files.
>
> For the moment, I launched on the fuse mount point
>
>> find . | xargs stat
>
> but crawling the whole volume (100 TB) to trigger self-healing of a single
> brick of 7.4 TB is inefficient.
>
> Is there any trick to only self-heal a single brick, for example by setting
> some attributes on its top directory?
>
> Many thanks,
>
> Alessandro
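Put together, the recovery Serkan describes looks roughly like this (a sketch using this thread's volume name "home", run after the brick has been re-formatted and re-mounted on server2):

    # respawn the brick process on the now-empty brick
    gluster volume start home force

    # full crawl: the self-heal daemon copies from the good replica
    gluster volume heal home full

    # monitor progress as the pending entry count drops
    gluster volume heal home info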
[Gluster-users] How to trigger a resync of a newly replaced empty brick in replicate config ?
Hi,

My volume home is configured in replicate mode (version 3.12.4) with the bricks
server1:/data/gluster/brick1
server2:/data/gluster/brick1

server2:/data/gluster/brick1 was corrupted, so I killed the gluster daemon for that brick on server2, umounted it, reformatted it, remounted it and did a

> gluster volume reset-brick home server2:/data/gluster/brick1
> server2:/data/gluster/brick1 commit force

I was expecting that the self-heal daemon would start copying data from server1:/data/gluster/brick1 (about 7.4 TB) to the empty server2:/data/gluster/brick1, which it only did for directories, but not for files.

For the moment, I launched on the fuse mount point

> find . | xargs stat

but crawling the whole volume (100 TB) to trigger self-healing of a single brick of 7.4 TB is inefficient.

Is there any trick to only self-heal a single brick, for example by setting some attributes on its top directory?

Many thanks,

Alessandro