Hopefully I have run down every avenue in troubleshooting this problem.
We have an IBM 1350 cluster running RH 7.3 with a custom kernel.  I am
running gmetad 2.5.2 and gmond 2.5.2.  We recently had a node go south when
there was an attempt to upgrade xcat.  The problem node was re-imaged and
gmond was reinstalled.  When I check the status of gmond on the node,
it appears to be running.  I have scanned the gmond port with nmap and it
is open.  I have restarted gmetad and gmond on all of the nodes.
The /var/lib/ganglia/rrds/<node directory> appears to have the old data
files from when the node crashed.  There is no firewall setting prohibiting
access to the node.  All of the other nodes are reporting fine.  From my
investigation, every .conf file is configured to the default settings.
Does anyone have any additional ideas as to where to start debugging this
node that no longer reports?  Any direction would be greatly appreciated.
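
A minimal sketch of one further check beyond the open port: pull the XML
dump from a gmond on a node that is reporting and see whether the re-imaged
node shows up in it.  This assumes gmond serves its XML cluster state over
TCP on the default port 8649 and that each HOST element carries NAME and
REPORTED attributes; the function names are made up for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the full XML dump gmond sends on connect (assumed default port 8649)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after sending the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def reported_hosts(xml_bytes):
    """Map each HOST NAME in the dump to its REPORTED timestamp (epoch seconds)."""
    root = ET.fromstring(xml_bytes)
    return {
        host.get("NAME"): int(host.get("REPORTED", "0"))
        for host in root.iter("HOST")
    }
```

Calling reported_hosts(fetch_gmond_xml("some-working-node")) and looking up
the problem node's name would show whether the other gmond daemons hear it
at all, and if so how stale its last report is.  A node missing from the
dump points at the gmond-to-gmond channel rather than at gmetad or the rrds.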


Ed Helwig
Linux/Unix Contractor
Honda HRAO

"I am so jealous your Windows viruses don't run on my Linux desktop."
