One of our servers encountered an I/O error that put its root
filesystem into read-only mode.  Both /var and /tmp are on that
filesystem, so all logging stopped, along with almost everything else.

However, gmond kept on running, and reporting metrics.  Great!
This is yet another way in which Ganglia wins over most other
monitoring systems that involve scripts that write things to disk or
otherwise depend on things (such as ssh logins) that need to write to
disk.

One thing did break, though: a program of mine that feeds custom
metrics to gmond via gmetric stopped working when the / filesystem
went read-only.  I tried running it in debug mode, and got this error:

  /etc/ganglia/gmond.conf:94: failed to determine the temp dir
  Parse error for '/etc/ganglia/gmond.conf'

Line 94 of gmond.conf is:
  include ('/etc/ganglia/conf.d/*.conf') 

We've never had an /etc/ganglia/conf.d directory; that include has
always just been ignored.

I tried feeding one of my custom metrics by hand:
[root ~]$ gmetric --name net_smtp_fin_wait2_out --value 0 --type uint8 --units 'connections'
/etc/ganglia/gmond.conf:94: failed to determine the temp dir
Parse error for '/etc/ganglia/gmond.conf'

Then, I cd'ed over to a filesystem that is still in read/write mode:
[root /otherfilesys]$ gmetric --name net_smtp_fin_wait2_out --value 0 --type uint8 --units 'connections'

No error, and it worked.
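
Until the underlying dependency is understood, the workaround above
(running gmetric from a directory on a writable filesystem) can be
scripted.  What follows is only a sketch: run_from_writable_dir and
is_writable are hypothetical helper names, and the list of candidate
directories is an assumption to adapt to the local mounts.  It probes
each candidate by actually creating a file, then runs the given command
from the first directory that accepts the write.

```shell
#!/bin/sh
# Sketch only: helper names and candidate directories are illustrative.

# Succeeds if a file can really be created in directory $1.
# The redirection runs in a subshell so that a failure cannot abort
# the calling script.
is_writable() {
    probe="$1/.write_probe.$$"
    ( : > "$probe" ) 2>/dev/null && rm -f "$probe"
}

# Usage: run_from_writable_dir "dir1 dir2 ..." command [args...]
# Runs the command with its CWD set to the first writable candidate.
# Candidates are split on whitespace, so paths containing spaces are
# not supported by this sketch.
run_from_writable_dir() {
    candidates=$1
    shift
    for dir in $candidates; do
        if [ -d "$dir" ] && is_writable "$dir"; then
            # Subshell: the caller's own CWD is left untouched.
            ( cd "$dir" && "$@" )
            return $?
        fi
    done
    echo "no writable directory among: $candidates" >&2
    return 1
}
```

For example, the feeder program could call it like this, where
/dev/shm is an assumed extra candidate and /otherfilesys is the
filesystem from the test above:

  run_from_writable_dir "/otherfilesys /dev/shm" gmetric --name net_smtp_fin_wait2_out --value 0 --type uint8 --units 'connections'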

What's the dependency that causes gmetric to require that the
filesystem the CWD is on be writeable?  Does it really need that
dependency?  It's great that Ganglia is so robust in the face of
failures, but it'd be even better if gmetric were just as robust.
  -- Cos

_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general
