Re: [Gluster-devel] glustershd status

2014-07-17 Thread Emmanuel Dreyfus
On Wed, Jul 16, 2014 at 09:54:49PM -0700, Harshavardhana wrote: In fact this is true on Linux as well - there is a smaller time window; observe the output below, running 'volume status' immediately after a 'volume start' event. I observe the same lapse on NetBSD if the volume is created and started.
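For reference, a minimal sequence that shows the window described above might look like the following; the volume name 'testvol' and the brick paths are hypothetical, and the 'Self-heal Daemon' row typically shows N/A until glustershd has registered with glusterd:

    gluster volume create testvol replica 2 server1:/bricks/b1 server2:/bricks/b2
    gluster volume start testvol
    gluster volume status testvol   # run immediately: Self-heal Daemon may still show N/A
    sleep 5
    gluster volume status testvol   # a moment later the daemon reports Online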

Re: [Gluster-devel] glustershd status

2014-07-17 Thread Harshavardhana
On a side note, while looking into this issue I uncovered a memory leak too: after successful registration with glusterd, the self-heal daemon and NFS server are killed by the FreeBSD memory manager. Have you observed any memory leaks? I have the valgrind output and it clearly indicates large

Re: [Gluster-devel] glustershd status

2014-07-17 Thread Krishnan Parthasarathi
Emmanuel, Could you take a statedump* of the glustershd process when it has leaked enough memory to be observable, and share the output? This might tell us what kind of objects we are allocating at an abnormally high rate. * statedump of a glusterfs process: # kill -USR1 <pid of process> HTH, Krish
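A minimal sketch of taking that statedump, assuming the pid is looked up from the process list and that the dump lands in the default state-dump directory (typically /var/run/gluster on distro builds; the path varies with the install prefix):

    # glustershd runs as a glusterfs process with 'glustershd' in its volfile-id
    PID=$(pgrep -f glustershd | head -n 1)
    # SIGUSR1 asks the process to write a statedump
    kill -USR1 "$PID"
    # the newest dump file should appear in the state-dump directory
    ls -lt /var/run/gluster | head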

Re: [Gluster-devel] glustershd status

2014-07-17 Thread Harshavardhana
KP, I do have 3.2 GB worth of valgrind output which indicates this issue; I am trying to reproduce it on Linux. My hunch says that compiling with --disable-epoll might actually trigger this issue on Linux too. Will update here once I have done that testing. On Wed, Jul 16, 2014 at 11:44 PM,
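For anyone wanting to repeat that experiment, a rough sketch of the build-and-observe cycle, assuming a source checkout; the exact glustershd command line should be copied from 'ps' output on a running system rather than taken from here:

    # build with the poll() event loop instead of epoll
    ./autogen.sh
    ./configure --disable-epoll
    make && make install
    # run the self-heal daemon in the foreground under valgrind to collect leak data
    valgrind --leak-check=full --log-file=/tmp/glustershd-valgrind.log \
        glusterfs -s localhost --volfile-id gluster/glustershd -N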

Re: [Gluster-devel] glustershd status

2014-07-17 Thread Krishnan Parthasarathi
Harsha, In addition to the valgrind output, statedump output of the glustershd process when the leak is observed would be really helpful. thanks, Krish - Original Message - Nope, spoke too early: using poll() has no effect on the memory usage on Linux, so we are actually back to FreeBSD. On

Re: [Gluster-devel] glustershd status

2014-07-17 Thread Harshavardhana
This is a small-memory system, around 1024M, and the disk space for the volume is 9 GB; I do not think it has anything to do with AFR per se - the same bug is also reproducible on the bricks and the NFS server too. Also it might be that we aren't able to capture glusterdumps on non-Linux platforms properly - one

Re: [Gluster-devel] glustershd status

2014-07-17 Thread Krishnan Parthasarathi
Harsha, I haven't gotten around to looking at the valgrind output. I am not sure if I will be able to do it soon since I am travelling next week. Are you seeing an equal number of disconnect messages in the glusterd logs? What is the ip:port you observe in the RPC_CLNT_CONNECT messages? Could you attach
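A quick, hedged way to pull those numbers out of the glusterd log; the path below is the common distro default and may differ with a non-standard prefix:

    LOG=/var/log/glusterfs/etc-glusterfs-glusterd.vol.log
    # compare the number of connect vs. disconnect notifications
    grep -c 'RPC_CLNT_CONNECT' "$LOG"
    grep -ci 'disconnect' "$LOG"
    # inspect the ip:port reported in the connect messages
    grep 'RPC_CLNT_CONNECT' "$LOG" | tail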

Re: [Gluster-devel] glustershd status

2014-07-17 Thread Harshavardhana
Sure, will do that! - if I get any clues I might send out a patch :-) On Thu, Jul 17, 2014 at 9:05 PM, Krishnan Parthasarathi kpart...@redhat.com wrote: Harsha, I haven't gotten around to looking at the valgrind output. I am not sure if I will be able to do it soon since I am travelling next

Re: [Gluster-devel] glustershd status

2014-07-16 Thread Emmanuel Dreyfus
Harshavardhana har...@harshavardhana.net wrote: It's pretty much the same on FreeBSD, I didn't spend much time debugging it. Let me do it right away and let you know what I find. Right. Once you have this one, I have Linux-specific truncate and md5sum replacements to contribute. I do not

Re: [Gluster-devel] glustershd status

2014-07-16 Thread Harshavardhana
So here is what I found - long email, please bear with me. It looks like the management daemon and these other daemons, e.g. the brick, nfs-server, and gluster self-heal daemon processes, work in a non-blocking manner, as in they notify the Gluster management daemon when they are available and when they are not. This
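As a rough illustration of that asynchronous registration, the sketch below (hypothetical volume name 'testvol') polls 'volume status' until the self-heal daemon has notified glusterd that it is up:

    gluster volume start testvol
    # the Online column for the Self-heal Daemon flips from N to Y once the
    # daemon has registered back with glusterd
    until gluster volume status testvol shd | grep 'Self-heal Daemon' | grep -qw Y; do
        sleep 1
    done
    echo "glustershd has registered with glusterd"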