I'm running gmetad v2.5.7 on Solaris 8 and it randomly falls over with a
mkdir() error:
deputy.nyc:/var/adm > grep gmetad messages messages.0
messages:Oct 25 04:25:29 deputy.nyc.deshaw.com /usr/local/ganglia/sbin/gmetad[15969]: [ID 937774 user.error] Unable to mkdir(/var/ganglia/rrds/__SummaryInfo__): Error 0
messages:Oct 25 04:25:29 deputy.nyc.deshaw.com /usr/local/ganglia/sbin/gmetad[15969]: [ID 937774 user.error] Unable to mkdir(/var/ganglia/rrds/__SummaryInfo__): Error 0
messages.0:Oct 23 02:57:32 deputy.nyc.deshaw.com /usr/local/ganglia/sbin/gmetad[29731]: [ID 937774 user.error] Unable to mkdir(/var/ganglia/rrds/__SummaryInfo__): Error 0
messages.0:Oct 23 02:57:32 deputy.nyc.deshaw.com /usr/local/ganglia/sbin/gmetad[29731]: [ID 937774 user.error] Unable to mkdir(/var/ganglia/rrds/__SummaryInfo__): Error 0
The permissions on /var/ganglia are definitely correct. Notice the
error text is "Error 0" which implies that errno is 0. Here's the
relevant code fragment (gmetad/rrd_helpers.c):
    static void inline
    my_mkdir ( const char *dir )
    {
       if ( mkdir ( dir, 0755 ) < 0 && errno != EEXIST)
          {
             err_sys("Unable to mkdir(%s)",dir);
          }
    }
I thought err_sys() might be mangling errno; however, some quick testing
shows that err_sys() reports errno correctly. So it seems that mkdir()
really does return < 0 while errno is still 0! Could this be some sort of
weird Solaris thread interaction with mkdir()? Any other ideas?
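
One way to narrow this down would be to copy errno into a local variable
immediately after mkdir() fails and log that copy, so nothing that runs in
between can change it. The following is only a sketch, not the gmetad code:
checked_mkdir() is a hypothetical stand-in for my_mkdir(), and it writes to
stderr instead of calling err_sys(). It also rests on an assumption I have
not verified for this build: that the object was compiled with thread
support enabled (for example -D_REENTRANT, or -mt with Sun's compiler), so
that errno refers to the calling thread's value rather than a process-wide
variable.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for my_mkdir(). Returns 0 if the directory was
 * created or already exists; otherwise returns the errno value observed
 * right after mkdir() failed.
 *
 * Assumption: built with thread support enabled (e.g. -D_REENTRANT) so
 * that errno is the calling thread's errno.
 */
static int
checked_mkdir(const char *dir)
{
    int saved_errno;

    errno = 0;
    if (mkdir(dir, 0755) == 0)
        return 0;

    /* Copy errno before calling anything else that might change it. */
    saved_errno = errno;
    if (saved_errno == EEXIST)
        return 0;

    /* Log both the number and the text so "Error 0" is unambiguous. */
    fprintf(stderr, "Unable to mkdir(%s): errno=%d (%s)\n",
            dir, saved_errno, strerror(saved_errno));
    return saved_errno;
}
```

If this variant still logs errno=0 after a failed mkdir(), that would point
away from err_sys() and toward how errno is being read in the calling
thread.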
BTW - I originally ran gmetad on Red Hat Enterprise release 3; however,
the system eventually ground to a halt. I'm attributing it to this
Red Hat bug: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=121434
(still unresolved). Soon enough I'll be out of possible platforms for
gmetad ;-).

David