[Ganglia-general] gmetad segfaults after running for a while (on AWS EC2)

2014-09-11 Thread Sam Barham
We are using Ganglia to monitor our cloud infrastructure on Amazon AWS. Everything is working correctly (metrics are flowing, etc.), except that occasionally the gmetad process will segfault out of the blue. The gmetad process is running on an m3.medium EC2 instance, and is monitoring about 50 servers.

Re: [Ganglia-general] gmetad segfaults after running for a while (on AWS EC2)

2014-09-14 Thread Sam Barham
, Sep 12, 2014 at 10:11 AM, Devon H. O'Dell devon.od...@gmail.com wrote: Are you able to share a core file? 2014-09-11 14:32 GMT-07:00 Sam Barham s.bar...@adinstruments.com: We are using Ganglia to monitor our cloud infrastructure on Amazon AWS. Everything is working correctly (metrics

[Ganglia-general] Help understanding tmax and dmax

2014-09-15 Thread Sam Barham
I'm having trouble understanding what values to use for dmax and tmax in my gmetric calls, and how those values match up to actual behaviour. The situation is that I have several cron scripts that each run once a minute, finding various custom metrics and passing them into ganglia. I then have
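For readers following this thread, here is a minimal sketch of how a once-a-minute cron script might pass tmax and dmax to gmetric. It assumes the usual meanings: tmax is the maximum number of seconds expected between updates of a metric, and dmax is the number of seconds after which a metric that has stopped updating is dropped (0 meaning never). The metric name `queue_depth`, the units, and the default values of 60 and 300 are hypothetical, chosen only for illustration and not taken from the thread.

```python
import subprocess


def gmetric_argv(name, value, units, tmax=60, dmax=300, metric_type="float"):
    """Build the gmetric command line for one metric sample.

    tmax: expected maximum seconds between updates (60 matches a
          once-a-minute cron job; an assumption, tune as needed).
    dmax: seconds of silence after which the metric is dropped
          (300 is an illustrative value; 0 means never drop).
    """
    return [
        "gmetric",
        "--name=" + name,
        "--value=" + str(value),
        "--type=" + metric_type,
        "--units=" + units,
        "--tmax=" + str(tmax),
        "--dmax=" + str(dmax),
    ]


def send_metric(name, value, units, **kwargs):
    """Run gmetric; raises CalledProcessError if it exits non-zero."""
    subprocess.run(gmetric_argv(name, value, units, **kwargs), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # on a machine without Ganglia installed.
    print(" ".join(gmetric_argv("queue_depth", 42, "jobs")))
```

Keeping dmax at several multiples of the cron interval means one or two missed runs do not make the metric disappear, while a host that has really stopped reporting still ages out.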

Re: [Ganglia-general] gmetad segfaults after running for a while (on AWS EC2)

2014-09-21 Thread Sam Barham
The debug build of 3.6.0 finally crashed over the weekend. The backtrace is:
#0 0x7f042e4ba38c in hash_insert (key=0x7f0425bcc440, val=0x7f0425bcc430, hash=0x7239d0) at hash.c:233
#1 0x00408551 in startElement_METRIC (data=0x7f0425bcc770, el=0x733930 METRIC, attr=0x709270) at

[Ganglia-general] gmond occasionally doesn't connect up in unicast

2014-11-12 Thread Sam Barham
We've got about 100 machines running on AWS EC2 instances, with Ganglia for monitoring. Because we are on Amazon, we can't use multicast, so in our architecture each cluster has a bastion machine, and every other machine in the cluster has gmond send its data to the bastion, which gmetad then

Re: [Ganglia-general] gmond occasionally doesn't connect up in unicast

2014-11-12 Thread Sam Barham
, fruitlessly trying to multicast into the void. Good luck! On Wed, Nov 12, 2014 at 2:41 PM, Sam Barham s.bar...@adinstruments.com wrote: We've got about 100 machines running on AWS EC2s, with Ganglia for monitoring. Because we are on Amazon, we can't use multicast, so the architecture we have

Re: [Ganglia-general] segfault on gmetad making Ganglia unusable.

2015-02-08 Thread Sam Barham
I can't help, unfortunately, but I can say that I've been having exactly the same issue, although less frequently (crashes anywhere from several times a day to once every couple of days). What is your gmetad hosted on? Mine is on Amazon Debian EC2 instances. Cheers, Sam. On Sun, Feb 8, 2015 at 11:21 AM,