Re: [Gluster-users] Slow write times to gluster disk

2018-06-20 Thread Raghavendra Gowdappa
On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa 
wrote:

> For the case of writes to glusterfs mount,
>
> I saw in earlier conversations that there are too many lookups, but a small
> number of writes. Since writes cached in write-behind would invalidate the
> metadata cache, lookups won't be absorbed by md-cache. I am wondering what
> the results would look like if we turn off performance.write-behind.
>
> @Pat,
>
> Can you set,
>
> # gluster volume set <volname> performance.write-behind off
>

Please turn on "group metadata-cache" for write tests too.


> and redo the tests writing to glusterfs mount? Let us know about the
> results you see.
>
> regards,
> Raghavendra
>
> On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa  > wrote:
>
>>
>>
>> On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa <
>> rgowd...@redhat.com> wrote:
>>
>>> For the case of reading from Glusterfs mount, read-ahead should help.
>>> However, we've known issues with read-ahead[1][2]. To work around these,
>>> can you try with,
>>>
>>> 1. Turn off performance.open-behind:
>>> # gluster volume set <volname> performance.open-behind off
>>>
>>> 2. Enable the metadata-cache option group:
>>> # gluster volume set <volname> group metadata-cache
>>>
>>
>> [1]  https://bugzilla.redhat.com/show_bug.cgi?id=1084508
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
>>
>>
>>>
>>>
>>> On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:
>>>

 Hi,

 We were recently revisiting our problems with the slowness of gluster
 writes (http://lists.gluster.org/pipermail/gluster-users/2017-April
 /030529.html). Specifically we were testing the suggestions in a
 recent post (http://lists.gluster.org/pipe
 rmail/gluster-users/2018-March/033699.html). The first two suggestions
 (specifying a negative-timeout in the mount settings or adding
 rpc-auth-allow-insecure to glusterd.vol) did not improve our performance,
 while setting "disperse.eager-lock off" provided a tiny (5%) speed-up.

 Some of the various tests we have tried earlier can be seen in the
 links below.  Do any of the above observations suggest what we could try
 next to either improve the speed or debug the issue?  Thanks

 http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
 http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html

 Pat

 --

 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Pat Haley  Email:  pha...@mit.edu
 Center for Ocean Engineering   Phone:  (617) 253-6824
 Dept. of Mechanical Engineering    Fax:    (617) 253-8125
 MIT, Room 5-213                    http://web.mit.edu/phaley/www/
 77 Massachusetts Avenue
 Cambridge, MA  02139-4301

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
>>>
>>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2018-06-20 Thread Raghavendra Gowdappa
Please note that these suggestions are for the native fuse mount.
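
For clarity, a native fuse mount here means mounting the volume with the
glusterfs FUSE client rather than exporting it over NFS. A typical mount might
look like the following sketch (server, volume and mount point are placeholders):

    # mount the volume with the glusterfs FUSE client
    mount -t glusterfs server1:/testvol /mnt/glusterfs
    # equivalent /etc/fstab entry (same placeholder names)
    # server1:/testvol  /mnt/glusterfs  glusterfs  defaults,_netdev  0 0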

On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa 
wrote:

> For the case of writes to glusterfs mount,
>
> I saw in earlier conversations that there are too many lookups, but a small
> number of writes. Since writes cached in write-behind would invalidate the
> metadata cache, lookups won't be absorbed by md-cache. I am wondering what
> the results would look like if we turn off performance.write-behind.
>
> @Pat,
>
> Can you set,
>
> # gluster volume set <volname> performance.write-behind off
>
> and redo the tests writing to glusterfs mount? Let us know about the
> results you see.
>
> regards,
> Raghavendra
>
> On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa  > wrote:
>
>>
>>
>> On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa <
>> rgowd...@redhat.com> wrote:
>>
>>> For the case of reading from Glusterfs mount, read-ahead should help.
>>> However, we've known issues with read-ahead[1][2]. To work around these,
>>> can you try with,
>>>
>>> 1. Turn off performance.open-behind:
>>> # gluster volume set <volname> performance.open-behind off
>>>
>>> 2. Enable the metadata-cache option group:
>>> # gluster volume set <volname> group metadata-cache
>>>
>>
>> [1]  https://bugzilla.redhat.com/show_bug.cgi?id=1084508
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
>>
>>
>>>
>>>
>>> On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:
>>>

 Hi,

 We were recently revisiting our problems with the slowness of gluster
 writes (http://lists.gluster.org/pipermail/gluster-users/2017-April
 /030529.html). Specifically we were testing the suggestions in a
 recent post (http://lists.gluster.org/pipe
 rmail/gluster-users/2018-March/033699.html). The first two suggestions
 (specifying a negative-timeout in the mount settings or adding
 rpc-auth-allow-insecure to glusterd.vol) did not improve our performance,
 while setting "disperse.eager-lock off" provided a tiny (5%) speed-up.

 Some of the various tests we have tried earlier can be seen in the
 links below.  Do any of the above observations suggest what we could try
 next to either improve the speed or debug the issue?  Thanks

 http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
 http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html

 Pat

 --

 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Pat Haley  Email:  pha...@mit.edu
 Center for Ocean Engineering   Phone:  (617) 253-6824
 Dept. of Mechanical Engineering    Fax:    (617) 253-8125
 MIT, Room 5-213                    http://web.mit.edu/phaley/www/
 77 Massachusetts Avenue
 Cambridge, MA  02139-4301

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
>>>
>>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2018-06-20 Thread Raghavendra Gowdappa
For the case of writes to glusterfs mount,

I saw in earlier conversations that there are too many lookups, but a small
number of writes. Since writes cached in write-behind would invalidate the
metadata cache, lookups won't be absorbed by md-cache. I am wondering what
the results would look like if we turn off performance.write-behind.

@Pat,

Can you set,

# gluster volume set <volname> performance.write-behind off

and redo the tests writing to glusterfs mount? Let us know about the
results you see.

regards,
Raghavendra

On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa 
wrote:

>
>
> On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa  > wrote:
>
>> For the case of reading from Glusterfs mount, read-ahead should help.
>> However, we've known issues with read-ahead[1][2]. To work around these,
>> can you try with,
>>
>> 1. Turn off performance.open-behind:
>> # gluster volume set <volname> performance.open-behind off
>>
>> 2. Enable the metadata-cache option group:
>> # gluster volume set <volname> group metadata-cache
>>
>
> [1]  https://bugzilla.redhat.com/show_bug.cgi?id=1084508
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
>
>
>>
>>
>> On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:
>>
>>>
>>> Hi,
>>>
>>> We were recently revisiting our problems with the slowness of gluster
>>> writes (http://lists.gluster.org/pipermail/gluster-users/2017-April
>>> /030529.html). Specifically we were testing the suggestions in a recent
>>> post (http://lists.gluster.org/pipermail/gluster-users/2018-March
>>> /033699.html). The first two suggestions (specifying a negative-timeout
>>> in the mount settings or adding rpc-auth-allow-insecure to glusterd.vol)
>>> did not improve our performance, while setting "disperse.eager-lock off"
>>> provided a tiny (5%) speed-up.
>>>
>>> Some of the various tests we have tried earlier can be seen in the links
>>> below.  Do any of the above observations suggest what we could try next to
>>> either improve the speed or debug the issue?  Thanks
>>>
>>> http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
>>> http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html
>>>
>>> Pat
>>>
>>> --
>>>
>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>> Pat Haley  Email:  pha...@mit.edu
>>> Center for Ocean Engineering   Phone:  (617) 253-6824
>>> Dept. of Mechanical Engineering    Fax:    (617) 253-8125
>>> MIT, Room 5-213                    http://web.mit.edu/phaley/www/
>>> 77 Massachusetts Avenue
>>> Cambridge, MA  02139-4301
>>>
>>> ___
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Need Help to get GEO Replication working - Error: Please check gsync config file. Unable to get statefile's name

2018-06-20 Thread Kotresh Hiremath Ravishankar
Hi Axel,

It's the latest. Ok, please share the geo-replication master and slave logs.

master location: /var/log/glusterfs/geo-replication
slave location: /var/log/glusterfs/geo-replication-slaves
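
If it helps, one way the session state and these logs might be gathered for
sharing is sketched below, reusing the gpool and geo.com::geovol names from the
session further down; the archive file names are arbitrary:

    # check the geo-replication session and its configuration (run on the master)
    gluster volume geo-replication gpool geo.com::geovol status
    gluster volume geo-replication gpool geo.com::geovol config
    # bundle the logs: master logs on the master node, slave logs on the slave node
    tar -czf geo-rep-master-logs.tar.gz /var/log/glusterfs/geo-replication/
    tar -czf geo-rep-slave-logs.tar.gz /var/log/glusterfs/geo-replication-slaves/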

Thanks,
Kotresh HR

On Tue, Jun 19, 2018 at 2:54 PM, Axel Gruber  wrote:

> Hello
>
> I'm using 2 Debian machines (virtual) with Gluster from the repo:
>
> root@glusters1:/# gluster --version
> glusterfs 4.1.0
> Repository revision: git://git.gluster.org/glusterfs.git
> Copyright (c) 2006-2016 Red Hat, Inc. 
>
>
>
>
>
> 
> Contact me directly via live chat - simply click here:
> 
> Kind regards
> Autohaus A. Gruber OHG
> Axel Gruber / Managing Director
>
> Tel: 0807193200
> Fax: 0807193202
> E-Mail: a...@agm.de
> Internet: www.autohaus-gruber.net
>
> Your strong MAZDA and HYUNDAI partner, four locations in the region - one of
> them near you.
>
> Autohaus A. Gruber OHG, Gewerbepark Kaserne 10, 83278 Traunstein.
>
> HRA 8216, Traunstein District Court, VAT ID DE813812187
>
> Managing Directors: Axel Gruber, Anton Gruber
>
> Tax number: 141/151/51801
>
>
> On Mon, 18 June 2018 at 11:30, Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
>> Hi Alex,
>>
>> Sorry, I lost the context.
>>
>> Which gluster version are you using?
>>
>> Thanks,
>> Kotresh HR
>>
>> On Sat, Jun 16, 2018 at 2:57 PM, Axel Gruber  wrote:
>>
>>> Hello
>>>
>>> I think it's better to open a new thread:
>>>
>>>
>>> I tried to install geo-replication again - set up the SSH key, prepared the
>>> session broker and so on (as shown in the manual).
>>>
>>> But I get this error:
>>>
>>> root@glusters1:~# gluster volume geo-replication gpool geo.com::geovol
>>> create force
>>> Please check gsync config file. Unable to get statefile's name
>>> geo-replication command failed
>>>
>>> I use "force" command because Slave Gluster is to small - but its empty
>>> - so whtiout Force i get:
>>>
>>> root@glusters1:~# gluster volume geo-replication gpool geo.com::geovol
>>> create
>>> Total disk size of master is greater than disk size of slave.
>>> Total available size of master is greater than available size of slave
>>> geo-replication command failed
>>>
>>> I also tried to adjust the size of the geo volume - so now the geo volume is
>>> bigger than the master volume - but I still get the same error.
>>>
>>>
>>> Can anyone help me understand what's going wrong here?
>>>
>>>
>>>
>>>
>>> 
>>> Contact me directly via live chat - simply click here:
>>> 
>>> Kind regards
>>> Autohaus A. Gruber OHG
>>> Axel Gruber / Managing Director
>>>
>>> Tel: 0807193200
>>> Fax: 0807193202
>>> E-Mail: a...@agm.de
>>> Internet: www.autohaus-gruber.net
>>>
>>> Your strong MAZDA and HYUNDAI partner, four locations in the region - one of
>>> them near you.
>>>
>>> Autohaus A. Gruber OHG, Gewerbepark Kaserne 10, 83278 Traunstein.
>>>
>>> HRA 8216, Traunstein District Court, VAT ID DE813812187
>>>
>>> Managing Directors: Axel Gruber, Anton Gruber
>>>
>>> Tax number: 141/151/51801
>>>
>>> ___
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>
>>
>>
>> --
>> Thanks and Regards,
>> Kotresh H R
>>
>


-- 
Thanks and Regards,
Kotresh H R
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-20 Thread Nithya Balachandran
Thank you. In the meantime, turning off parallel readdir should prevent the
first crash.
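
A minimal sketch of that workaround, with a hypothetical volume name:

    # volume name is a placeholder - use the affected volume
    VOL=testvol
    # disable parallel readdir until a fix for the readdir-ahead crash is released
    gluster volume set "$VOL" performance.parallel-readdir off
    # confirm the current value
    gluster volume get "$VOL" performance.parallel-readdir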


On 20 June 2018 at 21:42, mohammad kashif  wrote:

> Hi Nithya
>
> Thanks for the bug report. This new crash happened only once and only at
> one client in the last 6 days. I will let you know if it happens again or
> more frequently.
>
> Cheers
>
> Kashif
>
> On Wed, Jun 20, 2018 at 12:28 PM, Nithya Balachandran  > wrote:
>
>> Hi Mohammad,
>>
>> This is a different crash. How often does it happen?
>>
>>
>> We have managed to reproduce the first crash you reported and a bug has
>> been filed at [1].
>> We will work on a fix for this.
>>
>>
>> Regards,
>> Nithya
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593199
>>
>>
>> On 18 June 2018 at 14:09, mohammad kashif  wrote:
>>
>>> Hi
>>>
>>> The problem appeared again after a few days. This time, the client
>>> is glusterfs-3.10.12-1.el6.x86_64 and performance.parallel-readdir is
>>> off. The log level was set to ERROR and I got this log at the time of the crash:
>>>
>>> [2018-06-14 08:45:43.551384] E [rpc-clnt.c:365:saved_frames_unwind]
>>> (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x153)[0x7fac2e66ce03]
>>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fac2e434867]
>>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fac2e43497e]
>>> (--> 
>>> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xa5)[0x7fac2e434a45]
>>> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x278)[0x7fac2e434d68]
>>> ) 0-atlasglust-client-4: forced unwinding frame type(GlusterFS 3.3)
>>> op(READDIRP(40)) called at 2018-06-14 08:45:43.483303 (xid=0x7553c7
>>>
>>> A core dump was enabled on the client, so it created a dump. It is here:
>>>
>>> http://www-pnp.physics.ox.ac.uk/~mohammad/core.1002074
>>>
>>> I captured a gdb trace using this command:
>>>
>>> gdb /usr/sbin/glusterfs core.1002074 -ex bt -ex quit |& tee
>>> backtrace.log_18_16_1
>>>
>>>
>>> http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log_18_16_1
>>>
>>> I haven't used gdb much, so let me know if you want me to run gdb in a
>>> different manner.
>>>
>>> Thanks
>>>
>>> Kashif
>>>
>>>
>>> On Mon, Jun 18, 2018 at 6:27 AM, Raghavendra Gowdappa <
>>> rgowd...@redhat.com> wrote:
>>>


 On Mon, Jun 18, 2018 at 9:39 AM, Raghavendra Gowdappa <
 rgowd...@redhat.com> wrote:

>
>
> On Mon, Jun 18, 2018 at 8:11 AM, Raghavendra Gowdappa <
> rgowd...@redhat.com> wrote:
>
>> From the bt:
>>
>> #8  0x7f6ef977e6de in rda_readdirp (frame=0x7f6eec862320,
>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=357, off=2,
>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>> #9  0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>> orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
>> dht-common.c:5388
>> #10 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862210,
>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>> #11 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>> orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
>> dht-common.c:5388
>> #12 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862100,
>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>> #13 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>> orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
>> dht-common.c:5388
>> #14 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ff0,
>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>> #15 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>> orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
>> dht-common.c:5388
>> #16 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ee0,
>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>> #17 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>> orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
>> dht-common.c:5388
>> #18 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861dd0,
>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>> #19 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>> orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
>> 

Re: [Gluster-users] Slow write times to gluster disk

2018-06-20 Thread Raghavendra Gowdappa
On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa 
wrote:

> For the case of reading from Glusterfs mount, read-ahead should help.
> However, we've known issues with read-ahead[1][2]. To work around these,
> can you try with,
>
> 1. Turn off performance.open-behind:
> # gluster volume set <volname> performance.open-behind off
>
> 2. Enable the metadata-cache option group:
> # gluster volume set <volname> group metadata-cache
>

[1]  https://bugzilla.redhat.com/show_bug.cgi?id=1084508
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489


>
>
> On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:
>
>>
>> Hi,
>>
>> We were recently revisiting our problems with the slowness of gluster
>> writes (http://lists.gluster.org/pipermail/gluster-users/2017-April
>> /030529.html). Specifically we were testing the suggestions in a recent
>> post (http://lists.gluster.org/pipermail/gluster-users/2018-March
>> /033699.html). The first two suggestions (specifying a negative-timeout
>> in the mount settings or adding rpc-auth-allow-insecure to glusterd.vol)
>> did not improve our performance, while setting "disperse.eager-lock off"
>> provided a tiny (5%) speed-up.
>>
>> Some of the various tests we have tried earlier can be seen in the links
>> below.  Do any of the above observations suggest what we could try next to
>> either improve the speed or debug the issue?  Thanks
>>
>> http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
>> http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html
>>
>> Pat
>>
>> --
>>
>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>> Pat Haley  Email:  pha...@mit.edu
>> Center for Ocean Engineering   Phone:  (617) 253-6824
>> Dept. of Mechanical Engineering    Fax:    (617) 253-8125
>> MIT, Room 5-213                    http://web.mit.edu/phaley/www/
>> 77 Massachusetts Avenue
>> Cambridge, MA  02139-4301
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2018-06-20 Thread Raghavendra Gowdappa
For the case of reading from Glusterfs mount, read-ahead should help.
However, we've known issues with read-ahead[1][2]. To work around these,
can you try with,

1. Turn off performance.open-behind:
# gluster volume set <volname> performance.open-behind off

2. Enable the metadata-cache option group:
# gluster volume set <volname> group metadata-cache
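
Put together, the two steps above might look like the following sketch, assuming
a hypothetical volume name:

    # volume name is a placeholder - substitute the affected volume
    VOL=testvol
    # 1. turn off open-behind to work around the read-ahead issues noted above
    gluster volume set "$VOL" performance.open-behind off
    # 2. apply the metadata-cache option group
    gluster volume set "$VOL" group metadata-cache
    # confirm before repeating the read test from the fuse mount
    gluster volume get "$VOL" performance.open-behind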


On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:

>
> Hi,
>
> We were recently revisiting our problems with the slowness of gluster
> writes (http://lists.gluster.org/pipermail/gluster-users/2017-April
> /030529.html). Specifically we were testing the suggestions in a recent
> post (http://lists.gluster.org/pipermail/gluster-users/2018-March
> /033699.html). The first two suggestions (specifying a negative-timeout
> in the mount settings or adding rpc-auth-allow-insecure to glusterd.vol)
> did not improve our performance, while setting "disperse.eager-lock off"
> provided a tiny (5%) speed-up.
>
> Some of the various tests we have tried earlier can be seen in the links
> below.  Do any of the above observations suggest what we could try next to
> either improve the speed or debug the issue?  Thanks
>
> http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
> http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html
>
> Pat
>
> --
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Pat Haley  Email:  pha...@mit.edu
> Center for Ocean Engineering   Phone:  (617) 253-6824
> Dept. of Mechanical Engineering    Fax:    (617) 253-8125
> MIT, Room 5-213                    http://web.mit.edu/phaley/www/
> 77 Massachusetts Avenue
> Cambridge, MA  02139-4301
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Slow write times to gluster disk

2018-06-20 Thread Pat Haley


Hi,

We were recently revisiting our problems with the slowness of gluster 
writes 
(http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html). 
Specifically we were testing the suggestions in a recent post 
(http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html). 
The first two suggestions (specifying a negative-timeout in the mount 
settings or adding rpc-auth-allow-insecure to glusterd.vol) did not 
improve our performance, while setting "disperse.eager-lock off" 
provided a tiny (5%) speed-up.
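
For reference, the three previously tried suggestions are typically applied
along the following lines; the host, volume and timeout values here are
illustrative placeholders, not the ones used in the tests above:

    # (1) negative-timeout as a fuse mount option
    mount -t glusterfs -o negative-timeout=10 server1:/testvol /mnt/glusterfs
    # (2) allow insecure ports in /etc/glusterfs/glusterd.vol, then restart glusterd:
    #       option rpc-auth-allow-insecure on
    # (3) disable eager locking on the volume
    gluster volume set testvol disperse.eager-lock off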


Some of the various tests we have tried earlier can be seen in the links 
below.  Do any of the above observations suggest what we could try next 
to either improve the speed or debug the issue?  Thanks


http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html

Pat

--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:  pha...@mit.edu
Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-20 Thread mohammad kashif
Hi Nithya

Thanks for the bug report. This new crash happened only once and only at
one client in the last 6 days. I will let you know if it happens again or
more frequently.

Cheers

Kashif

On Wed, Jun 20, 2018 at 12:28 PM, Nithya Balachandran 
wrote:

> Hi Mohammad,
>
> This is a different crash. How often does it happen?
>
>
> We have managed to reproduce the first crash you reported and a bug has
> been filed at [1].
> We will work on a fix for this.
>
>
> Regards,
> Nithya
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593199
>
>
> On 18 June 2018 at 14:09, mohammad kashif  wrote:
>
>> Hi
>>
>> The problem appeared again after a few days. This time, the client
>> is glusterfs-3.10.12-1.el6.x86_64 and performance.parallel-readdir is
>> off. The log level was set to ERROR and I got this log at the time of the crash:
>>
>> [2018-06-14 08:45:43.551384] E [rpc-clnt.c:365:saved_frames_unwind] (-->
>> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x153)[0x7fac2e66ce03]
>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fac2e434867]
>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fac2e43497e]
>> (--> 
>> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xa5)[0x7fac2e434a45]
>> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x278)[0x7fac2e434d68]
>> ) 0-atlasglust-client-4: forced unwinding frame type(GlusterFS 3.3)
>> op(READDIRP(40)) called at 2018-06-14 08:45:43.483303 (xid=0x7553c7
>>
>> A core dump was enabled on the client, so it created a dump. It is here:
>>
>> http://www-pnp.physics.ox.ac.uk/~mohammad/core.1002074
>>
>> I captured a gdb trace using this command:
>>
>> gdb /usr/sbin/glusterfs core.1002074 -ex bt -ex quit |& tee
>> backtrace.log_18_16_1
>>
>>
>> http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log_18_16_1
>>
>> I haven't used gdb much, so let me know if you want me to run gdb in a
>> different manner.
>>
>> Thanks
>>
>> Kashif
>>
>>
>> On Mon, Jun 18, 2018 at 6:27 AM, Raghavendra Gowdappa <
>> rgowd...@redhat.com> wrote:
>>
>>>
>>>
>>> On Mon, Jun 18, 2018 at 9:39 AM, Raghavendra Gowdappa <
>>> rgowd...@redhat.com> wrote:
>>>


 On Mon, Jun 18, 2018 at 8:11 AM, Raghavendra Gowdappa <
 rgowd...@redhat.com> wrote:

> From the bt:
>
> #8  0x7f6ef977e6de in rda_readdirp (frame=0x7f6eec862320,
> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=357, off=2,
> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
> #9  0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
> orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
> dht-common.c:5388
> #10 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862210,
> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
> #11 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
> orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
> dht-common.c:5388
> #12 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862100,
> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
> #13 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
> orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
> dht-common.c:5388
> #14 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ff0,
> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
> #15 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
> orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
> dht-common.c:5388
> #16 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ee0,
> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
> #17 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
> orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
> dht-common.c:5388
> #18 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861dd0,
> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
> #19 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
> orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
> dht-common.c:5388
> #20 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861cc0,
> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
> #21 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, 

[Gluster-users] Announcing GlusterFS release 4.1.0 (Long Term Maintenance)

2018-06-20 Thread Shyam Ranganathan
The Gluster community is pleased to announce the release of 4.1, our
latest long term supported release.

This is a major release that includes a range of features enhancing
management, performance, and monitoring, and providing newer functionality
such as thin arbiters, cloud archival, and time consistency. It also contains
several bug fixes.

A selection of the important features and changes is documented on this
[1] page.

Announcements:

1. As 4.0 was a short-term maintenance release, features which have been
included in that release are available with 4.1.0 as well. These
features may be of interest to users upgrading to 4.1.0 from releases older
than 4.0. The 4.0 release notes capture the list of features that
were introduced with 4.0.

NOTE: As 4.0 was a short-term maintenance release, it will reach end of
life (EOL) with the release of 4.1.0. See [2].

2. Releases that receive maintenance updates after the 4.1 release are 3.12
and 4.1 (see the release schedule [2]).

NOTE: The 3.10 long-term maintenance release will reach end of life (EOL)
with the release of 4.1.0. See [2].

3. Continuing with this release, the CentOS storage SIG will not build
server packages for CentOS6. Server packages will be available for
CentOS7 only. For ease of migration, client packages on CentOS6 will be
published and maintained. See [3] and the install sketch after this list.

4. Minor updates for this release will be published on the 20th of every month.
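
For CentOS users, installing the matching packages from the Storage SIG
typically looks like the sketch below; the repository package name is assumed
from the SIG's usual naming and should be verified against [3]:

    # enable the CentOS Storage SIG repository for GlusterFS 4.1 (name assumed)
    yum install -y centos-release-gluster41
    # client-only install (CentOS6 or CentOS7)
    yum install -y glusterfs-fuse
    # full server install (CentOS7 only, as noted above)
    yum install -y glusterfs-server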

References:
[1] Release notes: https://docs.gluster.org/en/latest/release-notes/4.1.0/

[2] Release schedule: https://www.gluster.org/release-schedule/

[3] CentOS6 server package deprecation:
http://lists.gluster.org/pipermail/gluster-users/2018-January/033212.html
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-20 Thread Nithya Balachandran
Hi Mohammad,

This is a different crash. How often does it happen?


We have managed to reproduce the first crash you reported and a bug has
been filed at [1].
We will work on a fix for this.


Regards,
Nithya

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1593199


On 18 June 2018 at 14:09, mohammad kashif  wrote:

> Hi
>
> The problem appeared again after a few days. This time, the client
> is glusterfs-3.10.12-1.el6.x86_64 and performance.parallel-readdir is
> off. The log level was set to ERROR and I got this log at the time of the crash:
>
> [2018-06-14 08:45:43.551384] E [rpc-clnt.c:365:saved_frames_unwind] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x153)[0x7fac2e66ce03]
> (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fac2e434867]
> (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fac2e43497e]
> (--> 
> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xa5)[0x7fac2e434a45]
> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x278)[0x7fac2e434d68]
> ) 0-atlasglust-client-4: forced unwinding frame type(GlusterFS 3.3)
> op(READDIRP(40)) called at 2018-06-14 08:45:43.483303 (xid=0x7553c7
>
> A core dump was enabled on the client, so it created a dump. It is here:
>
> http://www-pnp.physics.ox.ac.uk/~mohammad/core.1002074
>
> I captured a gdb trace using this command:
>
> gdb /usr/sbin/glusterfs core.1002074 -ex bt -ex quit |& tee
> backtrace.log_18_16_1
>
>
> http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log_18_16_1
>
> I haven't used gdb much, so let me know if you want me to run gdb in a
> different manner.
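
If a fuller trace would help, one possible variation of the same gdb run is
sketched below; it assumes the CentOS 6 client mentioned above and that
yum-utils (for debuginfo-install) is available:

    # install debug symbols so the frames resolve
    debuginfo-install -y glusterfs glusterfs-fuse
    # dump a full backtrace of every thread from the same core file
    gdb /usr/sbin/glusterfs core.1002074 \
        -ex "set pagination off" \
        -ex "thread apply all bt full" \
        -ex quit > backtrace_full.log 2>&1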
>
> Thanks
>
> Kashif
>
>
> On Mon, Jun 18, 2018 at 6:27 AM, Raghavendra Gowdappa  > wrote:
>
>>
>>
>> On Mon, Jun 18, 2018 at 9:39 AM, Raghavendra Gowdappa <
>> rgowd...@redhat.com> wrote:
>>
>>>
>>>
>>> On Mon, Jun 18, 2018 at 8:11 AM, Raghavendra Gowdappa <
>>> rgowd...@redhat.com> wrote:
>>>
 From the bt:

 #8  0x7f6ef977e6de in rda_readdirp (frame=0x7f6eec862320,
 this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=357, off=2,
 xdata=0x7f6eec0085a0) at readdir-ahead.c:266
 #9  0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
 orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
 dht-common.c:5388
 #10 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862210,
 this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
 xdata=0x7f6eec0085a0) at readdir-ahead.c:266
 #11 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
 orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
 dht-common.c:5388
 #12 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862100,
 this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
 xdata=0x7f6eec0085a0) at readdir-ahead.c:266
 #13 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
 orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
 dht-common.c:5388
 #14 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ff0,
 this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
 xdata=0x7f6eec0085a0) at readdir-ahead.c:266
 #15 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
 orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
 dht-common.c:5388
 #16 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ee0,
 this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
 xdata=0x7f6eec0085a0) at readdir-ahead.c:266
 #17 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
 orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
 dht-common.c:5388
 #18 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861dd0,
 this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
 xdata=0x7f6eec0085a0) at readdir-ahead.c:266
 #19 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
 orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
 dht-common.c:5388
 #20 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861cc0,
 this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
 xdata=0x7f6eec0085a0) at readdir-ahead.c:266
 #21 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
 orig_entries=<optimized out>, xdata=0x7f6eec0085a0) at
 dht-common.c:5388
 #22 0x7f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861bb0,
 this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
 xdata=0x7f6eec0085a0) at readdir-ahead.c:266
 #23 0x7f6ef952db4c in dht_readdirp_cbk (frame=<optimized out>, cookie=0x7f6ef4019f20, 

Re: [Gluster-users] Geo-Replication memory leak on slave node

2018-06-20 Thread Mark Betham
Hi Kotresh,

Many thanks for your prompt response.  No need to apologise, any help you
can provide is greatly appreciated.

I look forward to receiving your update next week.

Many thanks,

Mark Betham

On Wed, 20 Jun 2018 at 10:55, Kotresh Hiremath Ravishankar <
khire...@redhat.com> wrote:

> Hi Mark,
>
> Sorry, I was busy and could not take a serious look at the logs. I can
> update you on Monday.
>
> Thanks,
> Kotresh HR
>
> On Wed, Jun 20, 2018 at 12:32 PM, Mark Betham <
> mark.bet...@performancehorizon.com> wrote:
>
>> Hi Kotresh,
>>
>> I was wondering if you had made any progress with regards to the issue I
>> am currently experiencing with geo-replication.
>>
>> For info the fault remains and effectively requires a restart of the
>> geo-replication service on a daily basis to reclaim the used memory on the
>> slave node.
>>
>> If you require any further information then please do not hesitate to ask.
>>
>> Many thanks,
>>
>> Mark Betham
>>
>>
>> On Mon, 11 Jun 2018 at 08:24, Mark Betham <
>> mark.bet...@performancehorizon.com> wrote:
>>
>>> Hi Kotresh,
>>>
>>> Many thanks.  I will shortly setup a share on my GDrive and send the
>>> link directly to yourself.
>>>
>>> For Info;
>>> The Geo-Rep slave failed again over the weekend but it did not recover
>>> this time.  It looks to have become unresponsive at around 14:40 UTC on 9th
>>> June.  I have attached an image showing the mem usage and you can see from
>>> this when the system failed.  The system was totally unresponsive and
>>> required a cold power off and then power on in order to recover the server.
>>>
>>> Many thanks for your help.
>>>
>>> Mark Betham.
>>>
>>> On 11 June 2018 at 05:53, Kotresh Hiremath Ravishankar <
>>> khire...@redhat.com> wrote:
>>>
 Hi Mark,

 Google drive works for me.

 Thanks,
 Kotresh HR

 On Fri, Jun 8, 2018 at 3:00 PM, Mark Betham <
 mark.bet...@performancehorizon.com> wrote:

> Hi Kotresh,
>
> The memory issue has recurred.  This indicates it will occur
> around once a day.
>
> Again, no traceback was listed in the log; the only update in the log was
> as follows:
> [2018-06-08 08:26:43.404261] I [resource(slave):1020:service_loop]
> GLUSTER: connection inactive, stopping timeout=120
> [2018-06-08 08:29:19.357615] I [syncdutils(slave):271:finalize] :
> exiting.
> [2018-06-08 08:31:02.432002] I [resource(slave):1502:connect] GLUSTER:
> Mounting gluster volume locally...
> [2018-06-08 08:31:03.716967] I [resource(slave):1515:connect] GLUSTER:
> Mounted gluster volume duration=1.2729
> [2018-06-08 08:31:03.717411] I [resource(slave):1012:service_loop]
> GLUSTER: slave listening
>
> I have attached an image showing the latest memory usage pattern.
>
> Can you please advise how I can pass the log data across to you?  As
> soon as I know this I will get the data uploaded for your review.
>
> Thanks,
>
> Mark Betham
>
>
>
>
> On 7 June 2018 at 08:19, Mark Betham <
> mark.bet...@performancehorizon.com> wrote:
>
>> Hi Kotresh,
>>
>> Many thanks for your prompt response.
>>
>> Below are my responses to your questions;
>>
>> 1. Is this trace back consistently hit? I just wanted to confirm
>> whether it's transient which occurs once in a while and gets back to 
>> normal?
>> It appears not.  As soon as the geo-rep recovered yesterday from the
>> high memory usage it immediately began rising again until it consumed all
>> of the available ram.  But this time nothing was committed to the log 
>> file.
>> I would like to add here that this current instance of geo-rep was
>> only brought online at the start of this week due to the issues with 
>> glibc
>> on CentOS 7.5.  This is the first time I have had geo-rep running with
>> Gluster ver 3.12.9, both storage clusters at each physical site were only
>> rebuilt approx. 4 weeks ago, due to the previous version in use going 
>> EOL.
>> Prior to this I had been running 3.13.2 (3.13.X now EOL) at each of the
>> sites and it is worth noting that the same behaviour was also seen on 
>> this
>> version of Gluster, unfortunately I do not have any of the log data from
>> then but I do not recall seeing any instances of the trace back message
>> mentioned.
>>
>> 2. Please upload the complete geo-rep logs from both master and slave.
>> I have the log files, just checking to make sure there is no
>> confidential info inside.  The logfiles are too big to send via email, 
>> even
>> when compressed.  Do you have a preferred method to allow me to share 
>> this
>> data with you or would a share from my Google drive be sufficient?
>>
>> 3. Are the gluster versions same across master and slave?
>> Yes, all gluster versions are the same across the two sites for all
>> 

Re: [Gluster-users] Geo-Replication memory leak on slave node

2018-06-20 Thread Kotresh Hiremath Ravishankar
Hi Mark,

Sorry, I was busy and could not take a serious look at the logs. I can
update you on Monday.

Thanks,
Kotresh HR

On Wed, Jun 20, 2018 at 12:32 PM, Mark Betham <
mark.bet...@performancehorizon.com> wrote:

> Hi Kotresh,
>
> I was wondering if you had made any progress with regards to the issue I
> am currently experiencing with geo-replication.
>
> For info the fault remains and effectively requires a restart of the
> geo-replication service on a daily basis to reclaim the used memory on the
> slave node.
>
> If you require any further information then please do not hesitate to ask.
>
> Many thanks,
>
> Mark Betham
>
>
> On Mon, 11 Jun 2018 at 08:24, Mark Betham  performancehorizon.com> wrote:
>
>> Hi Kotresh,
>>
>> Many thanks.  I will shortly setup a share on my GDrive and send the link
>> directly to yourself.
>>
>> For Info;
>> The Geo-Rep slave failed again over the weekend but it did not recover
>> this time.  It looks to have become unresponsive at around 14:40 UTC on 9th
>> June.  I have attached an image showing the mem usage and you can see from
>> this when the system failed.  The system was totally unresponsive and
>> required a cold power off and then power on in order to recover the server.
>>
>> Many thanks for your help.
>>
>> Mark Betham.
>>
>> On 11 June 2018 at 05:53, Kotresh Hiremath Ravishankar <
>> khire...@redhat.com> wrote:
>>
>>> Hi Mark,
>>>
>>> Google drive works for me.
>>>
>>> Thanks,
>>> Kotresh HR
>>>
>>> On Fri, Jun 8, 2018 at 3:00 PM, Mark Betham >> performancehorizon.com> wrote:
>>>
 Hi Kotresh,

 The memory issue has recurred.  This indicates it will occur
 around once a day.

 Again, no traceback was listed in the log; the only update in the log was as
 follows:
 [2018-06-08 08:26:43.404261] I [resource(slave):1020:service_loop]
 GLUSTER: connection inactive, stopping timeout=120
 [2018-06-08 08:29:19.357615] I [syncdutils(slave):271:finalize] :
 exiting.
 [2018-06-08 08:31:02.432002] I [resource(slave):1502:connect] GLUSTER:
 Mounting gluster volume locally...
 [2018-06-08 08:31:03.716967] I [resource(slave):1515:connect] GLUSTER:
 Mounted gluster volume duration=1.2729
 [2018-06-08 08:31:03.717411] I [resource(slave):1012:service_loop]
 GLUSTER: slave listening

 I have attached an image showing the latest memory usage pattern.

 Can you please advise how I can pass the log data across to you?  As
 soon as I know this I will get the data uploaded for your review.

 Thanks,

 Mark Betham




 On 7 June 2018 at 08:19, Mark Betham >>> performancehorizon.com> wrote:

> Hi Kotresh,
>
> Many thanks for your prompt response.
>
> Below are my responses to your questions;
>
> 1. Is this trace back consistently hit? I just wanted to confirm
> whether it's transient which occurs once in a while and gets back to 
> normal?
> It appears not.  As soon as the geo-rep recovered yesterday from the
> high memory usage it immediately began rising again until it consumed all
> of the available ram.  But this time nothing was committed to the log 
> file.
> I would like to add here that this current instance of geo-rep was
> only brought online at the start of this week due to the issues with glibc
> on CentOS 7.5.  This is the first time I have had geo-rep running with
> Gluster ver 3.12.9, both storage clusters at each physical site were only
> rebuilt approx. 4 weeks ago, due to the previous version in use going EOL.
> Prior to this I had been running 3.13.2 (3.13.X now EOL) at each of the
> sites and it is worth noting that the same behaviour was also seen on this
> version of Gluster, unfortunately I do not have any of the log data from
> then but I do not recall seeing any instances of the trace back message
> mentioned.
>
> 2. Please upload the complete geo-rep logs from both master and slave.
> I have the log files, just checking to make sure there is no
> confidential info inside.  The logfiles are too big to send via email, 
> even
> when compressed.  Do you have a preferred method to allow me to share this
> data with you or would a share from my Google drive be sufficient?
>
> 3. Are the gluster versions same across master and slave?
> Yes, all gluster versions are the same across the two sites for all
> storage nodes.  See below for version info taken from the current geo-rep
> master.
>
> glusterfs 3.12.9
> Repository revision: git://git.gluster.org/glusterfs.git
> Copyright (c) 2006-2016 Red Hat, Inc. 
> GlusterFS comes with ABSOLUTELY NO WARRANTY.
> It is licensed to you under your choice of the GNU Lesser
> General Public License, version 3 or any later version (LGPLv3
> or later), or the GNU General Public License, version 2 (GPLv2),

Re: [Gluster-users] Geo-Replication memory leak on slave node

2018-06-20 Thread Mark Betham
Hi Kotresh,

I was wondering if you had made any progress with regards to the issue I am
currently experiencing with geo-replication.

For info the fault remains and effectively requires a restart of the
geo-replication service on a daily basis to reclaim the used memory on the
slave node.
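
For reference, a scheduled restart of this kind might be scripted roughly as
follows; the volume and host names are placeholders, not taken from this thread:

    # master volume and slave host::volume are placeholders
    MASTERVOL=mastervol
    SLAVE=slavehost::slavevol
    # restart the geo-replication session to release the leaked memory
    gluster volume geo-replication "$MASTERVOL" "$SLAVE" stop
    gluster volume geo-replication "$MASTERVOL" "$SLAVE" start
    # confirm the workers come back and watch slave-side memory afterwards
    gluster volume geo-replication "$MASTERVOL" "$SLAVE" status detail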

If you require any further information then please do not hesitate to ask.

Many thanks,

Mark Betham


On Mon, 11 Jun 2018 at 08:24, Mark Betham <
mark.bet...@performancehorizon.com> wrote:

> Hi Kotresh,
>
> Many thanks.  I will shortly setup a share on my GDrive and send the link
> directly to yourself.
>
> For Info;
> The Geo-Rep slave failed again over the weekend but it did not recover
> this time.  It looks to have become unresponsive at around 14:40 UTC on 9th
> June.  I have attached an image showing the mem usage and you can see from
> this when the system failed.  The system was totally unresponsive and
> required a cold power off and then power on in order to recover the server.
>
> Many thanks for your help.
>
> Mark Betham.
>
> On 11 June 2018 at 05:53, Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
>> Hi Mark,
>>
>> Google drive works for me.
>>
>> Thanks,
>> Kotresh HR
>>
>> On Fri, Jun 8, 2018 at 3:00 PM, Mark Betham <
>> mark.bet...@performancehorizon.com> wrote:
>>
>>> Hi Kotresh,
>>>
>>> The memory issue has recurred.  This indicates it will occur
>>> around once a day.
>>>
>>> Again, no traceback was listed in the log; the only update in the log was as
>>> follows:
>>> [2018-06-08 08:26:43.404261] I [resource(slave):1020:service_loop]
>>> GLUSTER: connection inactive, stopping timeout=120
>>> [2018-06-08 08:29:19.357615] I [syncdutils(slave):271:finalize] :
>>> exiting.
>>> [2018-06-08 08:31:02.432002] I [resource(slave):1502:connect] GLUSTER:
>>> Mounting gluster volume locally...
>>> [2018-06-08 08:31:03.716967] I [resource(slave):1515:connect] GLUSTER:
>>> Mounted gluster volume duration=1.2729
>>> [2018-06-08 08:31:03.717411] I [resource(slave):1012:service_loop]
>>> GLUSTER: slave listening
>>>
>>> I have attached an image showing the latest memory usage pattern.
>>>
>>> Can you please advise how I can pass the log data across to you?  As
>>> soon as I know this I will get the data uploaded for your review.
>>>
>>> Thanks,
>>>
>>> Mark Betham
>>>
>>>
>>>
>>>
>>> On 7 June 2018 at 08:19, Mark Betham >> > wrote:
>>>
 Hi Kotresh,

 Many thanks for your prompt response.

 Below are my responses to your questions;

 1. Is this trace back consistently hit? I just wanted to confirm
 whether it's transient which occurs once in a while and gets back to 
 normal?
 It appears not.  As soon as the geo-rep recovered yesterday from the
 high memory usage it immediately began rising again until it consumed all
 of the available ram.  But this time nothing was committed to the log file.
 I would like to add here that this current instance of geo-rep was only
 brought online at the start of this week due to the issues with glibc on
 CentOS 7.5.  This is the first time I have had geo-rep running with Gluster
 ver 3.12.9, both storage clusters at each physical site were only rebuilt
 approx. 4 weeks ago, due to the previous version in use going EOL.  Prior
 to this I had been running 3.13.2 (3.13.X now EOL) at each of the sites and
 it is worth noting that the same behaviour was also seen on this version of
 Gluster, unfortunately I do not have any of the log data from then but I do
 not recall seeing any instances of the trace back message mentioned.

 2. Please upload the complete geo-rep logs from both master and slave.
 I have the log files, just checking to make sure there is no
 confidential info inside.  The logfiles are too big to send via email, even
 when compressed.  Do you have a preferred method to allow me to share this
 data with you or would a share from my Google drive be sufficient?

 3. Are the gluster versions same across master and slave?
 Yes, all gluster versions are the same across the two sites for all
 storage nodes.  See below for version info taken from the current geo-rep
 master.

 glusterfs 3.12.9
 Repository revision: git://git.gluster.org/glusterfs.git
 Copyright (c) 2006-2016 Red Hat, Inc. 
 GlusterFS comes with ABSOLUTELY NO WARRANTY.
 It is licensed to you under your choice of the GNU Lesser
 General Public License, version 3 or any later version (LGPLv3
 or later), or the GNU General Public License, version 2 (GPLv2),
 in all cases as published by the Free Software Foundation.

 glusterfs-geo-replication-3.12.9-1.el7.x86_64
 glusterfs-gnfs-3.12.9-1.el7.x86_64
 glusterfs-libs-3.12.9-1.el7.x86_64
 glusterfs-server-3.12.9-1.el7.x86_64
 glusterfs-3.12.9-1.el7.x86_64
 glusterfs-api-3.12.9-1.el7.x86_64
 glusterfs-events-3.12.9-1.el7.x86_64