Hey David:

Are the downed nodes always the same or are they sort of random?

Can you check /var/log/messages on the affected nodes and see if there are
any clues as to why Ganglia is reporting them as down?

Cheers,

Bernard 

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Dr. David F. Robinson
> Sent: Saturday, January 15, 2005 7:04
> To: [email protected]; 
> [email protected]
> Subject: [Oscar-users] ganglia
> 
> 
> Ganglia is reporting nodes 121-140 of my 140-node system as down.
> If I run
> 
> cexec '/etc/init.d/gmond restart'
> 
> all of the nodes show up as available. However, after an hour or two
> these nodes go back to a 'down' state.
> 
> They do not show up under a 'pbsnodes -l' command and they are
> working fine. I can submit and run jobs on these nodes.
> 
> Any suggestions?
> 
> Thanks in advance,
> 
> David
> 
> -------------------------------------------------------
> The SF.Net email is sponsored by: Beat the post-holiday blues 
> Get a FREE limited edition SourceForge.net t-shirt from ThinkGeek.
> It's fun and FREE -- well, almost....http://www.thinkgeek.com/sfshirt
> _______________________________________________
> Oscar-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/oscar-users
> 

