Greetings list!
I was only able to get ganglia to function on my system by patching
gmetad/rrd_helpers.c, changing err_sys to err_msg for this mkdir()
(patch attached).

I haven't a clue why gmetad insists on repeatedly trying to create
this directory on my system, but making this failure more benign seems
to work okay... my system is now collecting stats.

I did run gmetad under truss, and the output shows the mkdir() call
failing with EEXIST. So how the errno != EEXIST portion of this test is
passing, I don't know.
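
For anyone hitting the same thing, here is a rough sketch of a directory
check that does not depend on errno alone. It is not the attached patch and
not the code in gmetad/rrd_helpers.c; ensure_dir() is a hypothetical helper
name used only for illustration. The idea is to save errno right after
mkdir() fails, then ask stat() whether a directory is actually present, so
that an errno value that does not read back as EEXIST is not fatal when the
directory already exists.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper, not part of the ganglia source.
 * Returns 0 if `path` is a directory once the call finishes, -1 otherwise. */
static int ensure_dir(const char *path, mode_t mode)
{
    struct stat st;
    int saved_errno;

    if (mkdir(path, mode) == 0)
        return 0;                      /* directory was created */

    saved_errno = errno;               /* capture before any later call can change it */

    /* Do not rely on errno == EEXIST: check the filesystem directly. */
    if (stat(path, &st) == 0 && S_ISDIR(st.st_mode))
        return 0;                      /* already there, so not an error */

    /* A real failure: report it without terminating the process. */
    fprintf(stderr, "Unable to mkdir(%s): %s\n", path, strerror(saved_errno));
    return -1;
}
```

A caller would treat a return of -1 as "skip this update" rather than exit,
which is roughly the effect of switching err_sys to err_msg.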

Am I the first one to experience this problem? Is there a better workaround?

Thanks for any advice anyone can provide.

On Tue, Dec 23, 2008 at 4:20 PM, Ben Lentz <[email protected]> wrote:
> Greetings list!
> I am having trouble keeping gmetad stable. I am running ganglia-3.1.1
> on AIX 5.3 TL6 SP4 which I have compiled from source. It was compiled
> using gcc and linked with IBM's ld. The configure line was:
>
> ./configure --prefix=$WHERE --enable-shared=yes --enable-static=no
> --with-gmetad --disable-shared
>
> The build seemed to work fine. I have current versions of rrdtool,
> gcc, gnu make/sed, and libconfuse. I configured gmond and started it
> up.
>
> I created a configuration file for gmetad based on the sample provided
> in the source distribution. I created a
> /opt/local/var/lib/ganglia/rrds/unspecified directory owned by the
> 'nobody' user with mode 755. However, when I try to run gmetad at
> debug level 10, it dies with the following error (after a short
> delay):
>
> Going to run as user nobody
> Sources are ...
> Source: [my cluster, step 15] has 1 sources
>        127.0.0.1
> xml listening on port 8651
> interactive xml listening on port 8652
> Data thread 1800 is monitoring [my cluster] data source
>        127.0.0.1
> [my cluster] is a 2.5 or later data stream
> hash_create size = 1024
> hash->size is 1031
> hash_create size = 50
> hash->size is 53
> hash_create size = 50
> hash->size is 53
> Updating host optaixadmin01.cswg.com, metric load_one
> cleanup thread has been started
> RRD_create: msync rrd_file: Invalid argument
> [my cluster] is a 2.5 or later data stream
> Updating host hostname.domainname.tld, metric load_one
> Unable to mkdir(/opt/local/var/lib/ganglia/rrds/unspecified): Error 0
>
> If I try to start it again, it dies again right away with a different
> error (until I remove /opt/local/var/lib/ganglia/rrds/unspecified):
> Going to run as user nobody
> Sources are ...
> Source: [my cluster, step 15] has 1 sources
>        127.0.0.1
> xml listening on port 8651
> interactive xml listening on port 8652
> Data thread 1800 is monitoring [my cluster] data source
>        127.0.0.1
> cleanup thread has been started
> [my cluster] is a 2.5 or later data stream
> hash_create size = 1024
> hash->size is 1031
> hash_create size = 50
> hash->size is 53
> hash_create size = 50
> hash->size is 53
> Updating host hostname.domainname.tld, metric load_one
> Unable to mkdir(/opt/local/var/lib/ganglia/rrds/unspecified): Error 0
>
> If I stop gmond, remove /opt/local/var/lib/ganglia/rrds/unspecified,
> and start only gmetad, the process is stable, but I don't get any
> local statistics.
>
> A few minutes after I start gmond again, gmetad dies.
>
> Am I using this wrong, or is it currently unstable on AIX? Isn't it
> possible to run gmond on a gmetad server?
>
> Thanks in advance for any pointers you can provide.
>

Attachment: ganglia.patch
Description: Binary data
