Probably a stale lockfile somewhere in /var/lock/subsys. Try '/etc/init.d/gmond zap' and then restart it.
- Marcel

Foster, Scott (MS) wrote:
Thank you all for your help.  It is working now, but of course I'm trying to 
make one more tweak to get it exactly as we want it.  So I'm not sure if the 
following can be done.

I am trying to get the headnode and other servers to show up in the server 
group, but since the headnode is running gmetad, when I try to launch gmond I 
get an error message:  gmond dead subsys locked.

I'm not sure what that means.  Basically the server group doesn't show up in 
ganglia like all the other groups.

Scott

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bernard Li
Sent: Tuesday, November 02, 2004 11:28 AM
To: Foster, Scott (MS); Sean Dilda
Cc: [email protected]
Subject: RE: [Ganglia-general] Multiple clusters single gmetad


Hi:

You would only need to list them ALL for redundancy - if you list only
one node in the data_source and that node dies for whatever reason,
gmetad will not be able to collect data from that subcluster.  So I
guess having at least 2 nodes is good for a large subcluster.

I don't think anybody in their right mind would list all 1000 nodes in
data_source ;-)
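
As a rough sketch of that in gmetad.conf, assuming a subcluster whose
first two nodes are 10.10.2.1 and 10.10.2.2 (made-up name and addresses)
and gmond's default port 8649:

# gmetad.conf: two gmond hosts listed for one data source. gmetad asks
# them in order and uses the first one that answers.
data_source "Production Nodes" 10.10.2.1:8649 10.10.2.2:8649

Either host can answer for the whole subcluster, because every gmond on
the same multicast channel holds the state of all its peers.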

Good luck.

Cheers,

Bernard

-----Original Message-----
From: Foster, Scott (MS) [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 02, 2004 10:49
To: Bernard Li; Sean Dilda
Cc: [email protected]
Subject: RE: [Ganglia-general] Multiple clusters single gmetad

Ah that makes sense.

But if I'm running gmond on each node in the subcluster do I need to specify each of them as datasources? I'm not sure that makes sense especially if I have 1000 nodes each running gmond. I know I'm missing something here and I appreciate all your continued help.

Scott

-----Original Message-----
From: Bernard Li [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 02, 2004 10:09 AM
To: Foster, Scott (MS); Sean Dilda
Cc: [email protected]
Subject: RE: [Ganglia-general] Multiple clusters single gmetad


Hi Scott:

The different IP addresses shown in gmetad.conf are not addresses of the server running gmetad. Each one is the address of a node in a subgroup (the headnode of that subgroup, if you will) that runs gmond.

Check the syntax of the data_source tag: the IP/hostname listed there refers to one or all of the nodes in that subgroup/datasource.

Cheers,

Bernard

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Foster, Scott (MS)
Sent: Tuesday, November 02, 2004 9:14
To: Sean Dilda
Cc: [email protected]
Subject: RE: [Ganglia-general] Multiple clusters single gmetad

How does that work with the gmetad having 3 different IP addresses?
My goal was to have one server running gmetad, rrd, and apache. Do I
have to have gmetad running on another system to break my cluster into
sub-clusters?

Thanks for the earlier reply.  I appreciate it.

Scott

-----Original Message-----
From: Sean Dilda [mailto:[EMAIL PROTECTED]
Sent: Monday, November 01, 2004 5:14 PM
To: Foster, Scott (MS)
Cc: [email protected]
Subject: Re: [Ganglia-general] Multiple clusters single gmetad


On Mon, 2004-11-01 at 17:47, Foster, Scott (MS) wrote:

I've successfully set up ganglia (thanks to everyone on this list), but now I want to try and break out our cluster into smaller sub clusters.

1) Production Nodes
2) Remote Nodes
3) Test Nodes

I'm trying to do all this with one server running gmetad for all the nodes in the cluster regardless of their different status. Is this even possible?

I've been playing with different multicast addresses and xml addresses and haven't had much success. Has anyone ever tried this before?

Yes, this is possible, I'm currently running ganglia like this. I have a different gmond.conf for each subcluster. The only things that change between the gmond.conf files are the "name" and "mcast_channel"; all other settings are the same.
As for the gmetad.conf, here's the crucial part of it:

data_source "Head Cluster" 300 10.10.1.19
data_source "Storage" 300 10.10.1.1
data_source "Monitor" 300 localhost
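
For illustration, the per-subcluster gmond.conf difference might look roughly like this. It assumes the 2.5.x-style gmond.conf syntax; the cluster names are taken from the list in the question, and the multicast addresses are made up:

# gmond.conf on the production nodes
name "Production Nodes"
mcast_channel 239.2.11.71

# gmond.conf on the test nodes (same settings otherwise)
name "Test Nodes"
mcast_channel 239.2.11.72

Each data_source line in gmetad.conf then points at one node from the matching subcluster.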

