Hey David: Are the downed nodes always the same or are they sort of random?
Can you check /var/log/messages on the nodes and see if there are any clues to why Ganglia is reporting them as down?

Cheers,

Bernard

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of
> Dr. David F. Robinson
> Sent: Saturday, January 15, 2005 7:04
> To: [email protected];
> [email protected]
> Subject: [Oscar-users] ganglia
>
> Ganglia is reporting nodes 121-140 of my 140 node system as down. If I do a
>
> cexec '/etc/init.d/gmond restart'
>
> all of the nodes show up as available.
> However, after an hour or two these nodes go back to a 'down' state.
>
> They do not show up under a 'pbsnodes -l' command and they are working fine.
> I can submit and run jobs on these nodes.
>
> Any suggestions?
>
> Thanks in advance,
>
> David
>
> -------------------------------------------------------
> The SF.Net email is sponsored by: Beat the post-holiday blues
> Get a FREE limited edition SourceForge.net t-shirt from ThinkGeek.
> It's fun and FREE -- well, almost....http://www.thinkgeek.com/sfshirt
> _______________________________________________
> Oscar-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/oscar-users
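
For the /var/log/messages check suggested in the reply, a minimal sketch of one way to narrow the log down to gmond-related entries. The helper name gmond_log_lines, the 20-line default, and the assumption that gmond messages contain the string "gmond" are illustrative and not taken from the thread; what gmond actually logs depends on the local syslog setup.

```shell
# Sketch only: show the most recent gmond-related lines from a syslog file.
# gmond_log_lines is a hypothetical helper name, not part of Ganglia or OSCAR.
gmond_log_lines() {
    logfile="${1:-/var/log/messages}"   # log path mentioned in the thread
    count="${2:-20}"                    # number of trailing matches (assumed default)
    # Case-insensitive match on "gmond", keeping only the last $count hits.
    grep -i 'gmond' "$logfile" | tail -n "$count"
}
```

On a cluster with the C3 tools, the same pipeline could probably be run on every node in the form already used in the thread, for example cexec 'grep -i gmond /var/log/messages | tail -n 20', since a shell function defined on the head node is not available on the compute nodes.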
