OK, let's see if we can gather more info. I am not a specialist, but another pair of eyes can't hurt.
My system has a single glusterd process and it has a pretty low PID, which suggests it has not crashed and been restarted.

What is the PID of your glusterd? How many zombie processes does top report?

I've been running my preliminary tests with gluster for a little over a month now and have never seen this. My platform is CentOS 6.5, so I'd say it is pretty similar.

From my perspective, even making gluster sweat by running some intense rsync jobs in parallel, and seeing glusterd AND glusterfs take 120% of processing time in top (each on one core), they never crashed.

My zombie count, from top, is zero.

On the other hand, the other day one of my nodes crashed a process every time I started a highly demanding task. It turned out I had (and still have) a hardware problem with one of the processors (or the main board; still undiagnosed).

Do you have this problem on one node only? Any chance you have something special compiled into your kernel? Any particularly memory-hungry tweak in your sysctl?

Sounds like the system, not gluster.

KR,

Carlos

On Fri, Mar 21, 2014 at 10:29 PM, Steve Thomas <[email protected]> wrote:

> Hi all...
>
> Further investigation shows in excess of 500 glusterd zombie processes and
> continuing to climb on the box ...
>
> Any suggestions? Am happy to provide logs etc to get to the bottom of
> this....
>
> _____________________________________________
> *From:* Steve Thomas
> *Sent:* 21 March 2014 13:21
> *To:* '[email protected]'
> *Subject:* Gluster 3.4.2 on Redhat 6.5
>
>
> Hi,
>
> I'm running Gluster 3.4.2 on Redhat 6.5 with 4 servers with a brick on
> each. This brick is mounted locally and used by apache to serve audio
> files for an IVR system. Each of these audio files is typically around
> 80-100Kb.
>
> System appears to be working ok in terms of health and status via gluster
> CLI.
>
> The system is monitored by nagios and there's a check for zombie processes
> and the gluster status. It appears that over a 24 hour period the number of
> zombie processes on the box has increased and is continually increasing.
> Investigating, these are "glusterd" processes.
>
> I'm making an assumption but I'd suspect that the regular nagios checks
> are resulting in the increase in zombie processes as they are querying the
> glusterd process. The command that the nagios plugin is running is:
>
> #Check heal status
> gluster volume heal audio info
>
> #Check volume status
> gluster volume status audio detail
>
> Does anyone have any suggestions as to why glusterd is resulting in these
> zombie processes?
>
> Thanks for help in advance,
>
> Steve
>
>
>
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
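
To make the zombie count and the parent PID questions above easier to answer, here is a minimal sketch (not a gluster tool, just a hypothetical helper) that groups zombie processes by parent PID and command name. It assumes a Linux box with procps `ps`, where zombies show a state starting with `Z` and a `<defunct>` suffix on the command; the function names `count_zombies` and `report` are made up for illustration.

```python
import subprocess
from collections import Counter


def count_zombies(ps_output):
    """Count zombies per (parent PID, command) in `ps -eo pid,ppid,stat,comm` output."""
    counts = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)  # pid, ppid, stat, rest-of-line command
        if len(fields) < 4:
            continue
        _pid, ppid, stat, comm = fields
        if stat.startswith("Z"):  # Z = defunct (zombie) process state
            # procps appends "<defunct>" to the command of a zombie; drop it
            name = comm.replace("<defunct>", "").strip()
            counts[(int(ppid), name)] += 1
    return counts


def report():
    """Run ps on this machine and print zombie counts, largest group first."""
    out = subprocess.check_output(["ps", "-eo", "pid,ppid,stat,comm"], text=True)
    for (ppid, name), n in count_zombies(out).most_common():
        print(f"{n:6d} zombie(s) named {name!r} with parent PID {ppid}")
```

Calling `report()` on the affected node should show whether the zombies all hang off the single glusterd PID (which would mean glusterd is not reaping its children) or off something else, such as the nagios plugin.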
