Jesse Becker <[EMAIL PROTECTED]> wrote:
> On Fri, Oct 17, 2008 at 16:24, Ofer Inbar <[EMAIL PROTECTED]> wrote:
> > Ganglia 3.1.0 on CentOS 4.
>
> Ganglia 3.1.1, Solaris 10, Sparc.
> I'm also seeing a blocked gmond, although my situation may be slightly
> different.
>
> > I checked that gmond was running on that host, and it was.
> > However, attempts to connect to its port 8649 would indeed timeout.
>
> Same here.  Gmond will run fine for a while, then fail to respond to
> TCP connections.  Running 'telnet localhost 8649' fails to connect.
> In my case, "a while" ranges from minutes to hours--I've been testing
> this off and on since yesterday.
>
> > Restarting gmond on the aggregation host will fix the problem...for a while.
>
> Another important point is that gmond has *not* completely hung.
> Running it under debug mode (-d5) shows that it is both collecting
> metrics from the local system, and accepting metrics from the two
> other hosts.  The problem appears to be specifically with responding
> to TCP connections.

That does sound somewhat different.  In my case, tracing the running
gmond showed:

# strace -p 16830
Process 16830 attached - interrupt to quit
write(7, "<EXTRA_DATA>\n", 13 <unfinished ...>
Process 16830 detached

I had to ^C to get the <unfinished ...>, so while I was watching, it was
just sitting there waiting for the write to finish.  I think it was
trying to write to a TCP socket, because the lsof sample I took a little
bit later shows no file descriptor 7, but I *had* had an attempt to
connect to its port 8649, which I broke off before the lsof.

Also in my case, once I restarted it, it hasn't happened again.  That
doesn't mean it never will, but at least it isn't frequent.  I have more
than 40 gmonds running and have only seen one of them do this.
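
The two symptoms described above (a connect that fails or times out,
versus a connection that is accepted but whose write never finishes) can
be told apart with a small probe.  The following is only a minimal
sketch, assuming gmond listens on the default port 8649, dumps its XML,
and then closes the connection.  The function name probe_gmond, the
status strings, and the 5-second timeout are made up for illustration
and are not part of Ganglia.

```python
import socket


def probe_gmond(host="localhost", port=8649, timeout=5.0):
    """Classify how a gmond TCP port responds (hypothetical helper).

    Returns one of:
      "connect_failed"  the TCP connect was refused or timed out
      "stalled"         connected, but a read waited longer than `timeout`
      "reset"           connected, but the connection broke mid-read
      "empty"           connected and closed cleanly without sending data
      "ok"              connected, received data, and saw a clean close
    """
    try:
        # The timeout given here also stays set on the returned socket,
        # so each recv() below is bounded by the same value.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "connect_failed"

    received = 0
    with sock:
        try:
            while True:
                chunk = sock.recv(65536)
                if not chunk:  # peer closed the connection
                    break
                received += len(chunk)
        except socket.timeout:
            # No data arrived within `timeout` seconds on this read.
            return "stalled"
        except OSError:
            return "reset"

    return "ok" if received > 0 else "empty"
```

Run from cron or a loop against each host, for example
probe_gmond("somehost"), this would record when a gmond stops answering
and which of the two failure modes it is in.  Note that the timeout
applies to each read, not to the whole transfer.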

  -- Cos

_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general

