Re: [Gluster-users] Run away memory with gluster mount

2018-02-01 Thread Nithya Balachandran
Hi Dan,

It sounds like you might be running into [1]. The patch has been posted
upstream and the fix should be in the next release.
In the meantime, I'm afraid there is no way to get around this without
restarting the process.

Regards,
Nithya

[1]https://bugzilla.redhat.com/show_bug.cgi?id=1541264
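
(A minimal sketch of the restart workaround, assuming the localhost:/GlusterWWW
mount on /var/www shown later in this thread; anything holding files open on the
mount must be stopped before the unmount.)

  umount /var/www
  mount -t glusterfs localhost:/GlusterWWW /var/www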


On 2 February 2018 at 02:57, Dan Ragle  wrote:

>
>
> On 1/30/2018 6:31 AM, Raghavendra Gowdappa wrote:
>
>>
>>
>> - Original Message -
>>
>>> From: "Dan Ragle" 
>>> To: "Raghavendra Gowdappa" , "Ravishankar N" <
>>> ravishan...@redhat.com>
>>> Cc: gluster-users@gluster.org, "Csaba Henk" , "Niels
>>> de Vos" , "Nithya
>>> Balachandran" 
>>> Sent: Monday, January 29, 2018 9:02:21 PM
>>> Subject: Re: [Gluster-users] Run away memory with gluster mount
>>>
>>>
>>>
>>> On 1/29/2018 2:36 AM, Raghavendra Gowdappa wrote:
>>>


 - Original Message -

> From: "Ravishankar N" 
> To: "Dan Ragle" , gluster-users@gluster.org
> Cc: "Csaba Henk" , "Niels de Vos"  >,
> "Nithya Balachandran" ,
> "Raghavendra Gowdappa" 
> Sent: Saturday, January 27, 2018 10:23:38 AM
> Subject: Re: [Gluster-users] Run away memory with gluster mount
>
>
>
> On 01/27/2018 02:29 AM, Dan Ragle wrote:
>
>>
>> On 1/25/2018 8:21 PM, Ravishankar N wrote:
>>
>>>
>>>
>>> On 01/25/2018 11:04 PM, Dan Ragle wrote:
>>>
 *sigh* trying again to correct formatting ... apologize for the
 earlier mess.

 Having a memory issue with Gluster 3.12.4 and not sure how to
 troubleshoot. I don't *think* this is expected behavior.

 This is on an updated CentOS 7 box. The setup is a simple two node
 replicated layout where the two nodes act as both server and
 client.

 The volume in question:

 Volume Name: GlusterWWW
 Type: Replicate
 Volume ID: 8e9b0e79-f309-4d9b-a5bb-45d065f3
 Status: Started
 Snapshot Count: 0
 Number of Bricks: 1 x 2 = 2
 Transport-type: tcp
 Bricks:
 Brick1: vs1dlan.mydomain.com:/glusterfs_bricks/brick1/www
 Brick2: vs2dlan.mydomain.com:/glusterfs_bricks/brick1/www
 Options Reconfigured:
 nfs.disable: on
 cluster.favorite-child-policy: mtime
 transport.address-family: inet

 I had some other performance options in there (increased
 cache-size, md invalidation, etc.) but stripped them out in an
 attempt to isolate the issue. Still got the problem without them.

 The volume currently contains over 1M files.

 When mounting the volume, I get (among other things) a process as
 such:

 /usr/sbin/glusterfs --volfile-server=localhost
 --volfile-id=/GlusterWWW /var/www

 This process begins with little memory, but then as files are
 accessed in the volume the memory increases. I set up a script that
 simply reads the files in the volume one at a time (no writes). It's
 been running on and off about 12 hours now and the resident
 memory of the above process is already at 7.5G and continues to grow
 slowly. If I stop the test script the memory stops growing,
 but does not reduce. Restart the test script and the memory begins
 slowly growing again.

 This is obviously a contrived app environment. With my intended
 application load it takes about a week or so for the memory to get
 high enough to invoke the oom killer.

>>>
>>> Can you try debugging with the statedump
>>> (https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/#read-a-statedump) of
>>> the fuse mount process and see what member is leaking? Take the
>>> statedumps in succession, maybe once initially during the I/O and
>>> once the memory gets high enough to hit the OOM mark.
>>> Share the dumps here.
>>>
>>> Regards,
>>> Ravi
>>>
>>
>> Thanks for the reply. I noticed yesterday that an update (3.12.5) had
>> been posted so I went ahead and updated and repeated the test
>> overnight. The memory usage does not appear to be growing as quickly
>> as it was with 3.12.4, but does still appear to be growing.
>>
>> I should also mention that there is another process beyond my test app
>> that is reading the files from the volume. Specifically, there is an
>> rsync that runs from the second node 2-4 times an hour that reads from
>> the GlusterWWW volume mounted on node 1. Since none of the files in
>> that mount 

[Gluster-users] Release 3.12.6: Scheduled for the 12th of February

2018-02-01 Thread Jiffin Tony Thottan

Hi,

It's time to prepare the 3.12.6 release, which falls on the 10th of
each month and hence, with the 10th landing on a weekend, would be 12-02-2018 this time around.

This mail is to call out the following,

1) Are there any pending *blocker* bugs that need to be tracked for
3.12.6? If so, mark them against the provided tracker [1] as blockers
for the release, or at the very least post them as a response to this
mail.

2) Pending reviews in the 3.12 dashboard will be part of the release
*iff* they pass regressions and have the review votes, so use the
dashboard [2] to check the status of your patches to 3.12 and get
these going.

3) I have checked what went into 3.10 after the 3.12 release and whether
those fixes are already included in the 3.12 branch; the status here is
*green*, as all fixes ported to 3.10 have also been ported to 3.12.

Thanks,
Jiffin

[1] Release bug tracker:
https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.12.6

[2] 3.12 review dashboard:
https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:3-12-dashboard 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Tiered volume performance degrades badly after a volume stop/start or system restart.

2018-02-01 Thread Jeff Byers
The problem was simple: the sqlite3 DB connection parameters
were only being set on a newly created DB, not when there was
an existing DB. Apparently the sqlite3 default parameters
are not ideal. The patch is in the bug: 1540376.
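
(A hedged illustration only; the exact parameters the patch applies are in the bug.
The point is that pragmas such as journal_mode, synchronous and cache_size are
per-connection settings that fall back to the sqlite defaults unless they are
re-applied to an already-existing DB. The .db path is a placeholder for the
counters DB kept under each brick's .glusterfs directory, as described below.)

  sqlite3 /path/to/brick/.glusterfs/COUNTERS.db \
    'PRAGMA journal_mode; PRAGMA synchronous; PRAGMA cache_size; PRAGMA page_size;'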

On Thu, Feb 1, 2018 at 9:32 AM, Jeff Byers  wrote:
> This problem appears to be related to the sqlite3 DB files
> that are used for the tiering file access counters, stored on
> each hot and cold tier brick in .glusterfs/.db.
>
> When the tier is first created, these DB files do not exist,
> they are created, and everything works fine.
>
> On a stop/start or service restart, the .db files are already
> present, albeit empty since I don't have cluster.write-freq-threshold
> nor cluster.read-freq-threshold set, so
> features.record-counters is off and nothing should be going
> into the DB.
>
> I've found that if I delete these .db files after the volume
> stop, but before the volume start, the tiering performance is
> normal, not degraded. Of course all of the history in these DB
> files is lost. Not sure what other ramifications there are to
> deleting these .db files.
>
> When I did have one of the freq-threshold settings set, I did
> see a record get added to the file, so the sqlite3 DB is
> working to some degree.
>
> The sqlite3 version I have installed is sqlite-3.6.20-1.el6_7.2.x86_64.
>
> On Tue, Jan 30, 2018 at 10:17 PM, Vlad Kopylov  wrote:
>> Tested it in two different environments lately with exactly the same results.
>> Was trying to get better read performance from local mounts with
>> hundreds of thousands of maildir email files by using SSD,
>> hoping that .gluster file stat reads would improve, since those do migrate
>> to the hot tier.
>> After seeing what you described for 24 hours, and confirming all movement
>> between the tiers was done, I killed it.
>> Here are my volume settings - maybe they will be useful for spotting conflicting ones.
>>
>> cluster.shd-max-threads: 12
>> performance.rda-cache-limit: 128MB
>> cluster.readdir-optimize: on
>> cluster.read-hash-mode: 0
>> performance.strict-o-direct: on
>> cluster.lookup-unhashed: auto
>> performance.nl-cache: on
>> performance.nl-cache-timeout: 600
>> cluster.lookup-optimize: on
>> client.event-threads: 8
>> performance.client-io-threads: on
>> performance.md-cache-timeout: 600
>> server.event-threads: 8
>> features.cache-invalidation: on
>> features.cache-invalidation-timeout: 600
>> performance.stat-prefetch: on
>> performance.cache-invalidation: on
>> network.inode-lru-limit: 9
>> performance.cache-refresh-timeout: 10
>> performance.enable-least-priority: off
>> performance.cache-size: 2GB
>> cluster.nufa: on
>> cluster.choose-local: on
>> server.outstanding-rpc-limit: 128
>>
>> fuse mounting 
>> defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5
>>
>> On Tue, Jan 30, 2018 at 6:29 PM, Jeff Byers  wrote:
>>> I am fighting this issue:
>>>
>>>   Bug 1540376 – Tiered volume performance degrades badly after a
>>> volume stop/start or system restart.
>>>   https://bugzilla.redhat.com/show_bug.cgi?id=1540376
>>>
>>> Does anyone have any ideas on what might be causing this, and
>>> what a fix or work-around might be?
>>>
>>> Thanks!
>>>
>>> ~ Jeff Byers ~
>>>
>>> Tiered volume performance degrades badly after a volume
>>> stop/start or system restart.
>>>
>>> The degradation is very significant, making the performance of
>>> an SSD hot tiered volume a fraction of what it was with the
>>> HDD before tiering.
>>>
>>> Stopping and starting the tiered volume causes the problem to
>>> exhibit. Stopping and starting the Gluster services also does.
>>>
>>> Nothing in the tier is being promoted or demoted, the volume
>>> starts empty, a file is written, then read, then deleted. The
>>> file(s) only ever exist on the hot tier.
>>>
>>> This affects GlusterFS FUSE mounts, and also NFSv3 NFS mounts.
>>> The problem has been reproduced in two test lab environments.
>>> The issue was first seen using GlusterFS 3.7.18, and retested
>>> with the same result using GlusterFS 3.12.3.
>>>
>>> I'm using the default tiering settings, no adjustments.
>>>
>>> Nothing of any significance appears to be being reported in
>>> the GlusterFS logs.
>>>
>>> Summary:
>>>
>>> Before SSD tiering, HDD performance on a FUSE mount was 130.87
>>> MB/sec writes, 128.53 MB/sec reads.
>>>
>>> After SSD tiering, performance on a FUSE mount was 199.99
>>> MB/sec writes, 257.28 MB/sec reads.
>>>
>>> After GlusterFS volume stop/start, SSD tiering performance on
>>> FUSE mount was 35.81 MB/sec writes, 37.33 MB/sec reads. A very
>>> significant reduction in performance.
>>>
>>> Detaching and reattaching the SSD tier restores the good
>>> tiered performance.
>>>
>>> ~ Jeff Byers ~
>>> ___
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Run away memory with gluster mount

2018-02-01 Thread Dan Ragle



On 1/30/2018 6:31 AM, Raghavendra Gowdappa wrote:



- Original Message -

From: "Dan Ragle" 
To: "Raghavendra Gowdappa" , "Ravishankar N" 

Cc: gluster-users@gluster.org, "Csaba Henk" , "Niels de Vos" 
, "Nithya
Balachandran" 
Sent: Monday, January 29, 2018 9:02:21 PM
Subject: Re: [Gluster-users] Run away memory with gluster mount



On 1/29/2018 2:36 AM, Raghavendra Gowdappa wrote:



- Original Message -

From: "Ravishankar N" 
To: "Dan Ragle" , gluster-users@gluster.org
Cc: "Csaba Henk" , "Niels de Vos" ,
"Nithya Balachandran" ,
"Raghavendra Gowdappa" 
Sent: Saturday, January 27, 2018 10:23:38 AM
Subject: Re: [Gluster-users] Run away memory with gluster mount



On 01/27/2018 02:29 AM, Dan Ragle wrote:


On 1/25/2018 8:21 PM, Ravishankar N wrote:



On 01/25/2018 11:04 PM, Dan Ragle wrote:

*sigh* trying again to correct formatting ... apologize for the
earlier mess.

Having a memory issue with Gluster 3.12.4 and not sure how to
troubleshoot. I don't *think* this is expected behavior.

This is on an updated CentOS 7 box. The setup is a simple two node
replicated layout where the two nodes act as both server and
client.

The volume in question:

Volume Name: GlusterWWW
Type: Replicate
Volume ID: 8e9b0e79-f309-4d9b-a5bb-45d065f3
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: vs1dlan.mydomain.com:/glusterfs_bricks/brick1/www
Brick2: vs2dlan.mydomain.com:/glusterfs_bricks/brick1/www
Options Reconfigured:
nfs.disable: on
cluster.favorite-child-policy: mtime
transport.address-family: inet

I had some other performance options in there (increased
cache-size, md invalidation, etc.) but stripped them out in an
attempt to isolate the issue. Still got the problem without them.

The volume currently contains over 1M files.

When mounting the volume, I get (among other things) a process as such:

/usr/sbin/glusterfs --volfile-server=localhost
--volfile-id=/GlusterWWW /var/www

This process begins with little memory, but then as files are
accessed in the volume the memory increases. I set up a script that
simply reads the files in the volume one at a time (no writes). It's
been running on and off about 12 hours now and the resident
memory of the above process is already at 7.5G and continues to grow
slowly. If I stop the test script the memory stops growing,
but does not reduce. Restart the test script and the memory begins
slowly growing again.

This is obviously a contrived app environment. With my intended
application load it takes about a week or so for the memory to get
high enough to invoke the oom killer.
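
(A minimal sketch of the kind of read-only crawl described above, assuming the
/var/www mount point; not the actual test script. It reads every regular file once
per pass and never writes.)

  while true; do
      find /var/www -type f -print0 | xargs -0 -n 1 cat > /dev/null
  done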


Can you try debugging with the statedump
(https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/#read-a-statedump)
of
the fuse mount process and see what member is leaking? Take the
statedumps in succession, maybe once initially during the I/O and
once the memory gets high enough to hit the OOM mark.
Share the dumps here.

Regards,
Ravi
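
(A sketch of generating the statedumps requested above, following the linked doc:
send SIGUSR1 to the fuse client and the dump is written under /var/run/gluster as
glusterdump.<pid>.dump.<timestamp>. The pgrep pattern assumes the mount command
shown earlier in this thread.)

  pid=$(pgrep -f 'glusterfs --volfile-server=localhost --volfile-id=/GlusterWWW')
  kill -USR1 "$pid"      # repeat at intervals while the test runs
  ls -lrt /var/run/gluster/glusterdump."$pid".dump.*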


Thanks for the reply. I noticed yesterday that an update (3.12.5) had
been posted so I went ahead and updated and repeated the test
overnight. The memory usage does not appear to be growing as quickly
as it was with 3.12.4, but does still appear to be growing.

I should also mention that there is another process beyond my test app
that is reading the files from the volume. Specifically, there is an
rsync that runs from the second node 2-4 times an hour that reads from
the GlusterWWW volume mounted on node 1. Since none of the files in
that mount are changing it doesn't actually rsync anything, but
nonetheless it is running and reading the files in addition to my test
script. (It's a part of my intended production setup that I forgot was
still running.)

The mount process appears to be gaining memory at a rate of about 1GB
every 4 hours or so. At that rate it'll take several days before it
runs the box out of memory. But I took your suggestion and made some
statedumps today anyway, about 2 hours apart, 4 total so far. It looks
like there may already be some actionable information. These are the
only registers where the num_allocs have grown with each of the four
samples:

[mount/fuse.fuse - usage-type gf_fuse_mt_gids_t memusage]
   ---> num_allocs at Fri Jan 26 08:57:31 2018: 784
   ---> num_allocs at Fri Jan 26 10:55:50 2018: 831
   ---> num_allocs at Fri Jan 26 12:55:15 2018: 877
   ---> num_allocs at Fri Jan 26 14:58:27 2018: 908

[mount/fuse.fuse - usage-type gf_common_mt_fd_lk_ctx_t memusage]
   ---> num_allocs at Fri Jan 26 08:57:31 2018: 5
   ---> num_allocs at Fri Jan 26 10:55:50 2018: 10
   ---> num_allocs at Fri Jan 26 12:55:15 2018: 15
   ---> num_allocs at Fri Jan 26 14:58:27 2018: 17

[cluster/distribute.GlusterWWW-dht - 
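
(A sketch of pulling num_allocs for one suspect type out of a series of statedumps,
as done by hand above; assumes the dumps sit in /var/run/gluster.)

  for d in /var/run/gluster/glusterdump.*.dump.*; do
      echo "== $d"
      grep -A 4 'usage-type gf_fuse_mt_gids_t' "$d" | grep num_allocs
  done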

[Gluster-users] Gluster Monthly Newsletter, January 2018

2018-02-01 Thread Amye Scavarda
Gluster Monthly Newsletter, January 2018

4.0 is coming!

We’re currently tracking for a 4.0 release at the end of February, which
means our next edition will be all about 4.0!

This weekend, we have a busy schedule at FOSDEM with a Software Defined
Storage DevRoom on Sunday -

https://fosdem.org/2018/schedule/track/software_defined_storage/ with

Gluster-4.0 and GD2 - Learn what's in Gluster's future

https://fosdem.org/2018/schedule/event/gluster4_and_gd2/

Performance:

Optimizing Software Defined Storage for the Age of Flash

https://fosdem.org/2018/schedule/event/optimizing_sds/

We also have a Gluster stand in the main halls, come find us!


Event planning for next year:

As part of the Community Working Group issue queue, we’re asking for
recommendations of events that Gluster should be focusing on.
https://github.com/gluster/community/issues/7 has more details; we're
planning from March 2018 through March 2019 and we’d like your input! Where
should Gluster be at?


New Event announcement!

Announcing Glustered 2018 in Bologna (IT)

http://lists.gluster.org/pipermail/gluster-users/2017-December/033117.html


Top Contributing Companies:  Red Hat,  Gluster, Inc.,  Facebook


Noteworthy threads:

[Gluster-users] 2018 - Plans and Expectations on Gluster Community

http://lists.gluster.org/pipermail/gluster-users/2018-January/033144.html

[Gluster-users] Integration of GPU with glusterfs

http://lists.gluster.org/pipermail/gluster-users/2018-January/033206.html

[Gluster-users] IMP: Release 4.0: CentOS 6 packages will not be made
available

http://lists.gluster.org/pipermail/gluster-users/2018-January/033212.html

[Gluster-users] Community NetBSD regression tests EOL'd

http://lists.gluster.org/pipermail/gluster-users/2018-January/033214.html

[Gluster-devel] Release 4.0: Making it happen!

http://lists.gluster.org/pipermail/gluster-devel/2018-January/054164.html

[Gluster-devel] GD 2 xlator option changes

http://lists.gluster.org/pipermail/gluster-devel/2018-January/054214.html

[Gluster-devel] Regression tests time

http://lists.gluster.org/pipermail/gluster-devel/2018-January/054289.html


Upcoming CFPs:

Glustered with Incontro DevOps - CfP now open! - event is March 8, 2018

www.incontrodevops.it/events/glustered-2018/

LinuxCon China - March 4, 2018

https://www.lfasiallc.com/linuxcon-containercon-cloudopen-china/cfp


-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Tiered volume performance degrades badly after a volume stop/start or system restart.

2018-02-01 Thread Jeff Byers
This problem appears to be related to the sqlite3 DB files
that are used for the tiering file access counters, stored on
each hot and cold tier brick in .glusterfs/.db.

When the tier is first created, these DB files do not exist,
they are created, and everything works fine.

On a stop/start or service restart, the .db files are already
present, albeit empty since I don't have cluster.write-freq-threshold
nor cluster.read-freq-threshold set, so
features.record-counters is off and nothing should be going
into the DB.

I've found that if I delete these .db files after the volume
stop, but before the volume start, the tiering performance is
normal, not degraded. Of course all of the history in these DB
files is lost. Not sure what other ramifications there are to
deleting these .db files.
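
(A sketch of that workaround, with placeholder volume and brick paths; it has to be
repeated on every hot- and cold-tier brick while the volume is stopped, and the
counter history is lost, as noted above.)

  gluster volume stop VOLNAME
  # on each brick host, for each tier brick:
  rm -f /path/to/brick/.glusterfs/*.db
  gluster volume start VOLNAME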

When I did have one of the freq-threshold settings set, I did
see a record get added to the file, so the sqlite3 DB is
working to some degree.

The sqlite3 version I have installed is sqlite-3.6.20-1.el6_7.2.x86_64.

On Tue, Jan 30, 2018 at 10:17 PM, Vlad Kopylov  wrote:
> Tested it in two different environments lately with exactly the same results.
> Was trying to get better read performance from local mounts with
> hundreds of thousands of maildir email files by using SSD,
> hoping that .gluster file stat reads would improve, since those do migrate
> to the hot tier.
> After seeing what you described for 24 hours, and confirming all movement
> between the tiers was done, I killed it.
> Here are my volume settings - maybe they will be useful for spotting conflicting ones.
>
> cluster.shd-max-threads: 12
> performance.rda-cache-limit: 128MB
> cluster.readdir-optimize: on
> cluster.read-hash-mode: 0
> performance.strict-o-direct: on
> cluster.lookup-unhashed: auto
> performance.nl-cache: on
> performance.nl-cache-timeout: 600
> cluster.lookup-optimize: on
> client.event-threads: 8
> performance.client-io-threads: on
> performance.md-cache-timeout: 600
> server.event-threads: 8
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.stat-prefetch: on
> performance.cache-invalidation: on
> network.inode-lru-limit: 9
> performance.cache-refresh-timeout: 10
> performance.enable-least-priority: off
> performance.cache-size: 2GB
> cluster.nufa: on
> cluster.choose-local: on
> server.outstanding-rpc-limit: 128
>
> fuse mounting 
> defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5
>
> On Tue, Jan 30, 2018 at 6:29 PM, Jeff Byers  wrote:
>> I am fighting this issue:
>>
>>   Bug 1540376 – Tiered volume performance degrades badly after a
>> volume stop/start or system restart.
>>   https://bugzilla.redhat.com/show_bug.cgi?id=1540376
>>
>> Does anyone have any ideas on what might be causing this, and
>> what a fix or work-around might be?
>>
>> Thanks!
>>
>> ~ Jeff Byers ~
>>
>> Tiered volume performance degrades badly after a volume
>> stop/start or system restart.
>>
>> The degradation is very significant, making the performance of
>> an SSD hot tiered volume a fraction of what it was with the
>> HDD before tiering.
>>
>> Stopping and starting the tiered volume causes the problem to
>> exhibit. Stopping and starting the Gluster services also does.
>>
>> Nothing in the tier is being promoted or demoted, the volume
>> starts empty, a file is written, then read, then deleted. The
>> file(s) only ever exist on the hot tier.
>>
>> This affects GlusterFS FUSE mounts, and also NFSv3 NFS mounts.
>> The problem has been reproduced in two test lab environments.
>> The issue was first seen using GlusterFS 3.7.18, and retested
>> with the same result using GlusterFS 3.12.3.
>>
>> I'm using the default tiering settings, no adjustments.
>>
>> Nothing of any significance appears to be being reported in
>> the GlusterFS logs.
>>
>> Summary:
>>
>> Before SSD tiering, HDD performance on a FUSE mount was 130.87
>> MB/sec writes, 128.53 MB/sec reads.
>>
>> After SSD tiering, performance on a FUSE mount was 199.99
>> MB/sec writes, 257.28 MB/sec reads.
>>
>> After GlusterFS volume stop/start, SSD tiering performance on
>> FUSE mount was 35.81 MB/sec writes, 37.33 MB/sec reads. A very
>> significant reduction in performance.
>>
>> Detaching and reattaching the SSD tier restores the good
>> tiered performance.
>>
>> ~ Jeff Byers ~
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users



-- 
~ Jeff Byers ~
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] How to trigger a resync of a newly replaced empty brick in replicate config ?

2018-02-01 Thread Alessandro Ipe
Hi,


Thanks. However "gluster v heal volname full" returned the following error 
message
Commit failed on server4. Please check log file for details.

I have checked the log files in /var/log/glusterfs on server4 (by grepping for
"heal"), but did not get any match. What should I be looking for, and in which
log file, please?

Note that there is currently a rebalance process running on the volume.
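
(A hedged place to start looking, since "Commit failed on <peer>" for a CLI command
is normally reported by the management daemon on that peer; log file names vary a
little between versions, hence the two candidates.)

  # on server4
  grep -iE 'heal|commit' /var/log/glusterfs/glusterd.log 2>/dev/null
  grep -iE 'heal|commit' /var/log/glusterfs/etc-glusterfs-glusterd.vol.log 2>/dev/null
  tail -n 50 /var/log/glusterfs/cmd_history.log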


Many thanks,


A. 


On Thursday, 1 February 2018 17:32:19 CET Serkan Çoban wrote:
> You do not need to reset-brick if the brick path does not change. Replace
> the brick (format and mount it), then "gluster v start volname force".
> To start self-heal, just run "gluster v heal volname full".
> 
> On Thu, Feb 1, 2018 at 6:39 PM, Alessandro Ipe  
wrote:
> > Hi,
> > 
> > 
> > My volume home is configured in replicate mode (version 3.12.4) with the
> > bricks server1:/data/gluster/brick1
> > server2:/data/gluster/brick1
> > 
> > server2:/data/gluster/brick1 was corrupted, so I killed the gluster daemon for
> > that brick on server2, unmounted it, reformatted it, remounted it and did a
> >> gluster volume reset-brick home server2:/data/gluster/brick1
> >> server2:/data/gluster/brick1 commit force
> > I was expecting that the self-heal daemon would start copying data from
> > server1:/data/gluster/brick1 (about 7.4 TB) to the empty
> > server2:/data/gluster/brick1, which it only did for directories, but not
> > for files.
> > 
> > For the moment, I launched on the fuse mount point
> > 
> >> find . | xargs stat
> > 
> > but crawling the whole volume (100 TB) to trigger self-healing of a single
> > brick of 7.4 TB is inefficient.
> > 
> > Is there any trick to only self-heal a single brick, either by setting
> > some attributes on its top directory, for example?
> > 
> > 
> > Many thanks,
> > 
> > 
> > Alessandro
> > 
> > 
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users


-- 

 Dr. Ir. Alessandro Ipe
 Department of Observations - Remote Sensing from Space
 Royal Meteorological Institute
 Avenue Circulaire 3, B-1180 Brussels, Belgium
 Tel. +32 2 373 06 31   Email: alessandro@meteo.be   Web: http://gerb.oma.be



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] How to trigger a resync of a newly replaced empty brick in replicate config ?

2018-02-01 Thread Serkan Çoban
You do not need to reset-brick if the brick path does not change. Replace
the brick (format and mount it), then "gluster v start volname force".
To start self-heal, just run "gluster v heal volname full".
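
(A sketch of that sequence for the volume in this thread, with a placeholder device
and assuming an XFS brick mounted directly at the brick path; adjust to whatever
filesystem and layout the brick normally uses.)

  # on server2, after killing the brick process
  mkfs.xfs -f /dev/BRICK_DEVICE
  mount /dev/BRICK_DEVICE /data/gluster/brick1
  gluster volume start home force    # respawns the brick process
  gluster volume heal home full      # trigger a full self-heal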

On Thu, Feb 1, 2018 at 6:39 PM, Alessandro Ipe  wrote:
> Hi,
>
>
> My volume home is configured in replicate mode (version 3.12.4) with the 
> bricks
> server1:/data/gluster/brick1
> server2:/data/gluster/brick1
>
> server2:/data/gluster/brick1 was corrupted, so I killed the gluster daemon for
> that brick on server2, unmounted it, reformatted it, remounted it and did a
>> gluster volume reset-brick home server2:/data/gluster/brick1 
>> server2:/data/gluster/brick1 commit force
>
> I was expecting that the self-heal daemon would start copying data from 
> server1:/data/gluster/brick1
> (about 7.4 TB) to the empty server2:/data/gluster/brick1, which it only did 
> for directories, but not for files.
>
> For the moment, I launched on the fuse mount point
>> find . | xargs stat
> but crawling the whole volume (100 TB) to trigger self-healing of a single 
> brick of 7.4 TB is inefficient.
>
> Is there any trick to only self-heal a single brick, either by setting some 
> attributes on its top directory, for example?
>
>
> Many thanks,
>
>
> Alessandro
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] How to trigger a resync of a newly replaced empty brick in replicate config ?

2018-02-01 Thread Alessandro Ipe
Hi,


My volume home is configured in replicate mode (version 3.12.4) with the bricks
server1:/data/gluster/brick1
server2:/data/gluster/brick1

server2:/data/gluster/brick1 was corrupted, so I killed the gluster daemon for that
brick on server2, unmounted it, reformatted it, remounted it and did a
> gluster volume reset-brick home server2:/data/gluster/brick1 
> server2:/data/gluster/brick1 commit force

I was expecting that the self-heal daemon would start copying data from 
server1:/data/gluster/brick1 
(about 7.4 TB) to the empty server2:/data/gluster/brick1, which it only did for 
directories, but not for files. 

For the moment, I launched on the fuse mount point
> find . | xargs stat
but crawling the whole volume (100 TB) to trigger self-healing of a single 
brick of 7.4 TB is inefficient.

Is there any trick to only self-heal a single brick, either by setting some 
attributes on its top directory, for example?


Many thanks,


Alessandro


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users