2010/2/22 Samuel Hassine <samuel.hass...@gmail.com>:
> I'm also looking for a way to monitor gluster nodes.
> Any solutions ?
> On Monday, 22 February 2010 at 10:12 +0500, Anton wrote:
>> Hello!
>> I'm looking for a way to determine the health of a GlusterFS
>> cluster. Is there any way to determine whether any of the nodes has
>> failed? In the log files it is possible to grep for "remotexx:
>> disconnected", but that is not suitable for monitoring. There should
>> be a simple way to query the cluster against the .vol file and see
>> whether any node/brick failed to attach, and so trigger an alarm. Is
>> there anything like "gluster --reporthealth"?

Checking whether a TCP connection to the GlusterFS server port (6996
IIRC) can be established might serve as an indicator of whether a
server is working or has failed - at least for setups that use TCP. I
don't know if anything like that is possible for InfiniBand-only setups.
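
A minimal sketch of such a connection test in Python, in case it helps.
The function name port_is_open is made up for illustration, and the
default port 6996 is only my recollection from above; adjust it to
whatever your server .vol file actually listens on.

```python
import socket


def port_is_open(host, port=6996, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    The default port 6996 is an assumption; use the port your servers listen on.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False
```

Something like port_is_open("server1") would then return False when
nothing accepts connections on that port. It says nothing about whether
the bricks behind the port are healthy.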

IIRC, Nagios can check if a port is open on a remote machine. That
won't find something like disk/filesystem problems on the server, but
it could report crashed GlusterFS server processes and machines that
are not working at all.
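
The stock check_tcp plugin should already cover this. If you would
rather have a custom check, here is a rough sketch that follows the
usual Nagios plugin convention (one status line on stdout, exit code
0 = OK, 2 = CRITICAL, 3 = UNKNOWN). The names check_glusterfs_port and
main and the default port 6996 are my own assumptions, not anything
shipped with GlusterFS or Nagios.

```python
import socket

# Exit codes defined by the Nagios plugin convention
OK, CRITICAL, UNKNOWN = 0, 2, 3


def check_glusterfs_port(host, port, timeout=5.0):
    """Try a TCP connect and return (exit_code, status_line)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"GLUSTERFS OK - {host}:{port} accepts TCP connections"
    except OSError as exc:
        return CRITICAL, f"GLUSTERFS CRITICAL - {host}:{port} unreachable ({exc})"


def main(argv):
    """Parse HOST [PORT], print one status line, return the exit code."""
    if len(argv) not in (2, 3):
        print("GLUSTERFS UNKNOWN - usage: check_glusterfs_port.py HOST [PORT]")
        return UNKNOWN
    try:
        # 6996 is an assumed default; pass the real port as second argument
        port = int(argv[2]) if len(argv) == 3 else 6996
    except ValueError:
        print(f"GLUSTERFS UNKNOWN - invalid port: {argv[2]}")
        return UNKNOWN
    code, message = check_glusterfs_port(argv[1], port)
    print(message)
    return code

# To run it as a plugin, end the file with:
#   import sys; sys.exit(main(sys.argv))
```

As said, a CRITICAL result here only means that nothing answered on the
port; an OK result does not prove that the filesystem behind it works.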

I know that this simple method won't provide a positive status (=it
works) which would be preferable, but at least it can provide a
negative status (=_something_ failed on _that_ machine) in some cases.

IIRC, some time ago someone requested a syslog feature to debug
problems with GlusterFS as root filesystem for a diskless cluster -
is there any news on that?
Having the clients report problems to a central logging server might
be useful for monitoring.
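
Until something like that exists, a workaround outside of GlusterFS
could be a small script on each client that picks the "disconnected"
lines out of the client log and sends them to a syslog server. A rough
sketch using Python's standard logging module follows; the function
names, the logger name, the log path and the syslog host are all
placeholders of mine, and the only thing taken from this thread is that
the interesting lines contain "disconnected".

```python
import logging
import logging.handlers


def disconnect_lines(lines):
    """Yield the log lines that mention a disconnected remote."""
    for line in lines:
        if "disconnected" in line:
            yield line.rstrip("\n")


def forward_disconnects(log_path, syslog_host, syslog_port=514):
    """Send every 'disconnected' line in log_path to a syslog server (UDP)."""
    handler = logging.handlers.SysLogHandler(address=(syslog_host, syslog_port))
    logger = logging.getLogger("glusterfs-monitor")  # hypothetical logger name
    logger.setLevel(logging.WARNING)
    logger.addHandler(handler)
    try:
        # errors="replace" keeps odd bytes in the log from aborting the scan
        with open(log_path, errors="replace") as log_file:
            for line in disconnect_lines(log_file):
                logger.warning(line)
    finally:
        logger.removeHandler(handler)
        handler.close()
```

This rescans the whole file on every call, so a real deployment would
need to remember how far it has already read.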



