Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-20 Thread Nithya Balachandran
Thank you. In the meantime, turning off parallel readdir should prevent the first crash. On 20 June 2018 at 21:42, mohammad kashif wrote: > Hi Nithya > > Thanks for the bug report. This new crash happened only once and only at > one client in the last 6 days. I will let you know if it happened

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-20 Thread mohammad kashif
Hi Nithya Thanks for the bug report. This new crash happened only once and only at one client in the last 6 days. I will let you know if it happened again or more frequently. Cheers Kashif On Wed, Jun 20, 2018 at 12:28 PM, Nithya Balachandran wrote: > Hi Mohammad, > > This is a different

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-20 Thread Nithya Balachandran
Hi Mohammad, This is a different crash. How often does it happen? We have managed to reproduce the first crash you reported and a bug has been filed at [1]. We will work on a fix for this. Regards, Nithya [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593199 On 18 June 2018 at 14:09,

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-18 Thread mohammad kashif
Hi The problem appeared again after a few days. This time, the client is glusterfs-3.10.12-1.el6.x86_64 and performance.parallel-readdir is off. The log level was set to ERROR and I got this log at the time of the crash [2018-06-14 08:45:43.551384] E [rpc-clnt.c:365:saved_frames_unwind] (-->
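
Pulling error-level lines out of a large client log can be scripted. The sketch below is an illustration only: the line layout is inferred from the two log excerpts quoted in this thread, and the names `LOG_LINE`, `parse_line` and `lines_at_level` are hypothetical helpers, not part of any Gluster tool.

```python
import re

# Assumed line shape, inferred from the excerpts quoted in this thread:
#   [2018-06-14 08:45:43.551384] E [rpc-clnt.c:365:saved_frames_unwind] ...
#   [<timestamp>] I [MSGID: 115013] [server-helpers.c:289:do_fd_cleanup] ...
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"
    r"\[(?P<source>[^\]]+)\]\s*(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def lines_at_level(lines, levels=("E",)):
    """Yield parsed entries whose one-letter severity is in `levels`."""
    for line in lines:
        entry = parse_line(line)
        if entry and entry["level"] in levels:
            yield entry
```

Lines that do not fit the assumed layout (for example, raw backtrace text written on a crash) are skipped by this sketch, so it complements reading the log around the crash time rather than replacing it.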

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-17 Thread Raghavendra Gowdappa
On Mon, Jun 18, 2018 at 9:39 AM, Raghavendra Gowdappa wrote: > > > On Mon, Jun 18, 2018 at 8:11 AM, Raghavendra Gowdappa > wrote: > >> From the bt: >> >> #8 0x7f6ef977e6de in rda_readdirp (frame=0x7f6eec862320, >> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=357, off=2, >>

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-17 Thread Raghavendra Gowdappa
On Mon, Jun 18, 2018 at 8:11 AM, Raghavendra Gowdappa wrote: > From the bt: > > #8 0x7f6ef977e6de in rda_readdirp (frame=0x7f6eec862320, > this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=357, off=2, > xdata=0x7f6eec0085a0) at readdir-ahead.c:266 > #9 0x7f6ef952db4c in dht_readdirp_cbk

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-17 Thread Raghavendra Gowdappa
From the bt: #8 0x7f6ef977e6de in rda_readdirp (frame=0x7f6eec862320, this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=357, off=2, xdata=0x7f6eec0085a0) at readdir-ahead.c:266 #9 0x7f6ef952db4c in dht_readdirp_cbk (frame=, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
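
When a backtrace is thousands of frames deep, counting how often each function appears is a quick way to check whether a few functions are calling each other repeatedly. The sketch below assumes gdb frame lines shaped like the ones quoted above (`#8 0x... in rda_readdirp (`); `frame_functions` and `repeated_functions` are hypothetical helper names chosen for this illustration.

```python
import re
from collections import Counter

# Matches gdb frame lines of the shape quoted above:
#   #8 0x7f6ef977e6de in rda_readdirp (frame=...
# The address part is optional because gdb omits it for some frames.
FRAME = re.compile(r"#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([A-Za-z_]\w*)\s*\(")


def frame_functions(backtrace_text):
    """Return the function names in frame order, as found in the backtrace."""
    return [m.group(2) for m in FRAME.finditer(backtrace_text)]


def repeated_functions(backtrace_text, min_count=2):
    """Return (function, count) pairs seen at least min_count times, most frequent first."""
    counts = Counter(frame_functions(backtrace_text))
    return [(name, n) for name, n in counts.most_common() if n >= min_count]
```

A small set of functions with very high counts would be consistent with runaway recursion, though the counts alone do not show why it happens.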

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-17 Thread mohammad kashif
Hi Nithya The FUSE volfile, taken after disabling parallel-readdir, is here: http://www-pnp.physics.ox.ac.uk/~mohammad/atlasglust.tcp-fuse.vol Unfortunately I can't take the risk of enabling parallel-readdir, as the cluster is in heavy use and it is likely to kill many jobs if clients unmount again. There is one

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-15 Thread Nithya Balachandran
On 15 June 2018 at 13:45, Nithya Balachandran wrote: > Hi Mohammad, > > I was unable to reproduce this on a volume created on a system running > 3.12.9. > > Can you send me the FUSE volfiles for the volume atlasglust? They will be > in /var/lib/glusterd/vols/atlasglust/ on any of the gluster

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-15 Thread Nithya Balachandran
Hi Mohammad, I was unable to reproduce this on a volume created on a system running 3.12.9. Can you send me the FUSE volfiles for the volume atlasglust? They will be in /var/lib/glusterd/vols/atlasglust/ on any of the gluster servers hosting the volume and called *.tcp-fuse.vol. Thanks,
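
As a small illustration of the lookup described above, the following sketch lists the `*.tcp-fuse.vol` files for a volume. The directory and file pattern come from the message; the function name `fuse_volfiles` and its `base_dir` parameter are assumptions made for this example.

```python
from pathlib import Path


def fuse_volfiles(volume, base_dir="/var/lib/glusterd/vols"):
    """Return the sorted paths of *.tcp-fuse.vol files for the given volume.

    base_dir defaults to the location named in the message; it is a
    parameter so the lookup can be pointed at a copy of the directory.
    """
    return sorted(Path(base_dir, volume).glob("*.tcp-fuse.vol"))


if __name__ == "__main__":
    # Intended to be run on one of the gluster servers hosting the volume.
    for path in fuse_volfiles("atlasglust"):
        print(path)
```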

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-14 Thread mohammad kashif
Hi Nithya It seems that the problem can be solved by either turning parallel-readdir off or downgrading the client to 3.10.12-1. Yesterday I downgraded some clients to 3.10.12-1 and it seems to have fixed the problem. Today, when I saw your email, I turned parallel-readdir off and the current client

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-13 Thread Nithya Balachandran
+Poornima who works on parallel-readdir. @Poornima, Have you seen anything like this before? On 14 June 2018 at 10:07, Nithya Balachandran wrote: > This is not the same issue as the one you are referring - that was in the > RPC layer and caused the bricks to crash. This one is different as it

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-13 Thread Nithya Balachandran
This is not the same issue as the one you are referring - that was in the RPC layer and caused the bricks to crash. This one is different as it seems to be in the dht and rda layers. It does look like a stack overflow though. @Mohammad, Please send the following information: 1. gluster volume

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-13 Thread Milind Changire
+Nithya Nithya, Do these logs [1] look similar to the recursive readdir() issue that you encountered just a while back ? i.e. recursive readdir() response definition in the XDR [1] http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log On Wed, Jun 13, 2018 at 4:29 PM, mohammad kashif wrote:

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-13 Thread mohammad kashif
Hi Milind Thanks a lot, I managed to run gdb and produced a traceback as well. It's here http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log I am trying to understand it but am still not able to make sense of it. Thanks Kashif On Wed, Jun 13, 2018 at 11:34 AM, Milind Changire wrote: >

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-13 Thread Milind Changire
Kashif, FYI: http://debuginfo.centos.org/centos/6/storage/x86_64/ On Wed, Jun 13, 2018 at 3:21 PM, mohammad kashif wrote: > Hi Milind > > There is no glusterfs-debuginfo available for gluster-3.12 from > http://mirror.centos.org/centos/6/storage/x86_64/gluster-3.12/ repo. Do > you know from

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-13 Thread mohammad kashif
Hi Milind There is no glusterfs-debuginfo available for gluster-3.12 from http://mirror.centos.org/centos/6/storage/x86_64/gluster-3.12/ repo. Do you know from where I can get it? Also when I run gdb, it says Missing separate debuginfos, use: debuginfo-install glusterfs-fuse-3.12.9-1.el6.x86_64

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-12 Thread mohammad kashif
Hi Milind I will send you links for logs. I collected these core dumps at client and there is no glusterd process running on client. Kashif On Tue, Jun 12, 2018 at 4:14 PM, Milind Changire wrote: > Kashif, > Could you also send over the client/mount log file as Vijay suggested ? > Or maybe

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-12 Thread mohammad kashif
Hi Vijay I have enabled TRACE for the client and there are lots of TRACE messages in the log but no 'crash'. The only error I can see is about the inode context being NULL [io-cache.c:564:ioc_open_cbk] 0-atlasglust-io-cache: inode context is NULL (748157d2-274f-4595-9bb6-afb1fb5a0642) [Invalid argument]

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-12 Thread Milind Changire
Kashif, Could you also send over the client/mount log file as Vijay suggested? Or maybe just the crash backtrace lines. Also, you've mentioned that you straced glusterd, but when you ran gdb, you ran it over /usr/sbin/glusterfs On Tue, Jun 12, 2018 at 8:19 PM, Vijay Bellur wrote: >

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-12 Thread Vijay Bellur
On Tue, Jun 12, 2018 at 7:40 AM, mohammad kashif wrote: > Hi Milind > > The operating system is Scientific Linux 6 which is based on RHEL6. The > cpu arch is Intel x86_64. > > I will send you a separate email with link to core dump. > You could also grep for crash in the client log file and

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-12 Thread mohammad kashif
Hi Milind The operating system is Scientific Linux 6 which is based on RHEL6. The cpu arch is Intel x86_64. I will send you a separate email with link to core dump. Thanks for your help. Kashif On Tue, Jun 12, 2018 at 3:16 PM, Milind Changire wrote: > Kashif, > Could you share the core

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-12 Thread Milind Changire
Kashif, Could you share the core dump via Google Drive or something similar Also, let me know the CPU arch and OS Distribution on which you are running gluster. If you've installed the glusterfs-debuginfo package, you'll also get the source lines in the backtrace via gdb On Tue, Jun 12, 2018

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-12 Thread mohammad kashif
Hi Milind, Vijay Thanks, I have some more information now as I straced glusterd on client 138544 0.000131 mprotect(0x7f2f70785000, 4096, PROT_READ|PROT_WRITE) = 0 <0.26> 138544 0.000128 mprotect(0x7f2f70786000, 4096, PROT_READ|PROT_WRITE) = 0 <0.27> 138544 0.000126

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-12 Thread Milind Changire
Kashif, You can change the log level by: $ gluster volume set <VOLNAME> diagnostics.brick-log-level TRACE $ gluster volume set <VOLNAME> diagnostics.client-log-level TRACE and see how things fare. If you want fewer logs you can change the log-level to DEBUG instead of TRACE. On Tue, Jun 12, 2018 at 3:37 PM,
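
For anyone scripting these settings, here is a minimal dry-run sketch that only builds the command lines and prints them. It assumes the general `gluster volume set <VOLNAME> <KEY> <VALUE>` form and uses the volume name `atlasglust` from this thread; `volume_set_command` and `log_level_commands` are hypothetical helper names, and nothing is executed.

```python
import shlex


def volume_set_command(volume, option, value):
    """Build the argv for `gluster volume set <volume> <option> <value>`."""
    for part in (volume, option, value):
        # Reject empty or whitespace-containing arguments early.
        if not part or any(ch.isspace() for ch in part):
            raise ValueError(f"invalid argument: {part!r}")
    return ["gluster", "volume", "set", volume, option, value]


def log_level_commands(volume, level="TRACE"):
    """Return the brick and client log-level commands for one volume."""
    return [
        volume_set_command(volume, f"diagnostics.{side}-log-level", level)
        for side in ("brick", "client")
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of running them.
    for argv in log_level_commands("atlasglust"):
        print(shlex.join(argv))
```

Printing first and running by hand keeps the change reviewable, which matters on a cluster in heavy use.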

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-12 Thread mohammad kashif
Hi Vijay Now it is unmounting every 30 mins! The server log at /var/log/glusterfs/bricks/glusteratlas-brics001-gv0.log has only this line [2018-06-12 09:53:19.303102] I [MSGID: 115013] [server-helpers.c:289:do_fd_cleanup] 0-atlasglust-server: fd cleanup on

Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-11 Thread Vijay Bellur
On Mon, Jun 11, 2018 at 8:50 AM, mohammad kashif wrote: > Hi > > Since I have updated our gluster server and client to latest version > 3.12.9-1, I am having this issue of gluster getting unmounted from client > very regularly. It was not a problem before update. > > It's a distributed file

[Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

2018-06-11 Thread mohammad kashif
Hi Since I updated our gluster servers and clients to the latest version, 3.12.9-1, I have had this issue of gluster getting unmounted from the client very regularly. It was not a problem before the update. It's a distributed file system with no replication. We have seven servers totaling around 480TB