Greetings, list!
I am having trouble keeping gmetad stable. I am running ganglia-3.1.1,
which I compiled from source, on AIX 5.3 TL6 SP4. It was compiled with
gcc and linked with IBM's ld. The configure line was:
    ./configure --prefix=$WHERE --enable-shared=yes --enable-static=no \
        --with-gmetad --disable-shared
The build seemed to work fine. I have current versions of rrdtool,
gcc, gnu make/sed, and libconfuse. I configured gmond and started it
up.
I created a configuration file for gmetad based on the sample provided
in the source distribution. I created a
/opt/local/var/lib/ganglia/rrds/unspecified directory owned by the
'nobody' user with mode 755. However, when I try to run gmetad at
debug level 10, it dies with the following error (after a short
delay):
Going to run as user nobody
Sources are ...
Source: [my cluster, step 15] has 1 sources
127.0.0.1
xml listening on port 8651
interactive xml listening on port 8652
Data thread 1800 is monitoring [my cluster] data source
127.0.0.1
[my cluster] is a 2.5 or later data stream
hash_create size = 1024
hash->size is 1031
hash_create size = 50
hash->size is 53
hash_create size = 50
hash->size is 53
Updating host optaixadmin01.cswg.com, metric load_one
cleanup thread has been started
RRD_create: msync rrd_file: Invalid argument
[my cluster] is a 2.5 or later data stream
Updating host hostname.domainname.tld, metric load_one
Unable to mkdir(/opt/local/var/lib/ganglia/rrds/unspecified): Error 0
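For reference, the directory setup and debug run that produced the
output above amount to roughly the following. This is a sketch rather
than a transcript of what I typed: RRD_ROOT, CONF, DRY_RUN, run and
setup_and_run are names made up for illustration, the gmetad.conf path
is a placeholder, and by default the script only prints the commands
instead of executing them.

```shell
#!/bin/sh
# Sketch only: RRD_ROOT, CONF, DRY_RUN, run() and setup_and_run() are
# illustrative names, not part of ganglia.
RRD_ROOT=${RRD_ROOT:-/opt/local/var/lib/ganglia/rrds}
CONF=${CONF:-/path/to/gmetad.conf}   # placeholder, not the real location
DRY_RUN=${DRY_RUN:-1}                # 1 = print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_and_run() {
    # The directory that gmetad later reports it cannot mkdir.
    run mkdir -p "$RRD_ROOT/unspecified"
    run chown nobody "$RRD_ROOT/unspecified"
    run chmod 755 "$RRD_ROOT/unspecified"
    # Run gmetad at debug level 10, as in the log above.
    run gmetad -c "$CONF" -d 10
}

setup_and_run
```

With DRY_RUN=0 (and run as root, so that the chown succeeds) the same
script would actually create the directory and start gmetad.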
If I try to start it again, it dies again right away with a different
error (until I remove /opt/local/var/lib/ganglia/rrds/unspecified):
Going to run as user nobody
Sources are ...
Source: [my cluster, step 15] has 1 sources
127.0.0.1
xml listening on port 8651
interactive xml listening on port 8652
Data thread 1800 is monitoring [my cluster] data source
127.0.0.1
cleanup thread has been started
[my cluster] is a 2.5 or later data stream
hash_create size = 1024
hash->size is 1031
hash_create size = 50
hash->size is 53
hash_create size = 50
hash->size is 53
Updating host hostname.domainname.tld, metric load_one
Unable to mkdir(/opt/local/var/lib/ganglia/rrds/unspecified): Error 0
If I stop gmond, remove /opt/local/var/lib/ganglia/rrds/unspecified,
and start only gmetad, the process is stable, but I don't get any
local statistics.
A few minutes after I start gmond again, gmetad dies.
Am I using this wrong, or is it currently unstable on AIX? Isn't it
possible to run gmond on a gmetad server?
Thanks in advance for any pointers you can provide.
Ganglia-general mailing list
https://lists.sourceforge.net/lists/listinfo/ganglia-general