Hi Scott:

8651 is the default XML port, so don't use it as the multicast port!

Cheers,

Bernard

> -----Original Message-----
> From: Foster, Scott (MS) [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, November 02, 2004 14:31
> To: Marcel Birthelmer
> Cc: Bernard Li; Sean Dilda; [email protected]
> Subject: RE: [Ganglia-general] Multiple clusters single gmetad
>
> Thanks for the help. I found the lock file in /var/lock/subsys and
> deleted the gmond file. However, I still get the problem whenever I
> try to start gmond after gmetad is started. I reversed the order and
> get the same error message, except now gmetad says it is dead.
>
> Here is a quick summary of my setup:
>
> Production Nodes - Multicast 8650
> Remote Nodes - Multicast 8651
> Experimental Nodes - Multicast 8652
> Servers - Multicast 8649
>
> gmetad, apache, rrd, and gmond are set up to run on one server.
> gmetad starts ok, gmond gives me "gmond dead, subsys locked", rrd and
> apache work fine.
>
> My server is set up in its gmond.conf file to be part of the
> "Servers" cluster, utilizing multicast port 8649, and I have all
> trusted set to on.
>
> If I've left something out I apologize. I'm happy to post more info
> if needed.
>
> Thanks in advance.
>
> Scott
>
> -----Original Message-----
> From: Marcel Birthelmer [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, November 02, 2004 2:14 PM
> To: Foster, Scott (MS)
> Cc: Bernard Li; Sean Dilda; [email protected]
> Subject: Re: [Ganglia-general] Multiple clusters single gmetad
>
> Probably a stale lockfile somewhere in /var/lock/subsys. Try
> '/etc/init.d/gmond zap' and then restart it.
> - Marcel
>
> Foster, Scott (MS) wrote:
> > Thank you all for your help. It is working now, but of course I'm
> > trying to make one more tweak to get it exactly as we want it. So
> > I'm not sure if the following can be done.
> >
> > I am trying to get the headnode and other servers to show up in the
> > server group, but since the headnode is running gmetad, when I try
> > to launch gmond I get an error message: gmond dead subsys locked.
> >
> > I'm not sure what that means. Basically the server group doesn't
> > show up in ganglia like all the other groups.
> >
> > Scott
> >
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] Behalf Of
> > Bernard Li
> > Sent: Tuesday, November 02, 2004 11:28 AM
> > To: Foster, Scott (MS); Sean Dilda
> > Cc: [email protected]
> > Subject: RE: [Ganglia-general] Multiple clusters single gmetad
> >
> > Hi:
> >
> > You would only need to list them ALL for redundancy - if you only
> > list one node in the data_source, then if that node dies for
> > whatever reason gmetad will not be able to collect data from that
> > subcluster. So I guess having at least 2 nodes is good for a large
> > subcluster.
> >
> > I don't think anybody in their right mind would list all 1000 nodes
> > in data_source ;-)
> >
> > Good luck.
> >
> > Cheers,
> >
> > Bernard
> >
> >>-----Original Message-----
> >>From: Foster, Scott (MS) [mailto:[EMAIL PROTECTED]
> >>Sent: Tuesday, November 02, 2004 10:49
> >>To: Bernard Li; Sean Dilda
> >>Cc: [email protected]
> >>Subject: RE: [Ganglia-general] Multiple clusters single gmetad
> >>
> >>Ah that makes sense.
> >>
> >>But if I'm running gmond on each node in the subcluster do I need to
> >>specify each of them as datasources? I'm not sure that makes sense
> >>especially if I have 1000 nodes each running gmond. I know I'm
> >>missing something here and I appreciate all your continued help.
> >>
> >>Scott
> >>
> >>-----Original Message-----
> >>From: Bernard Li [mailto:[EMAIL PROTECTED]
> >>Sent: Tuesday, November 02, 2004 10:09 AM
> >>To: Foster, Scott (MS); Sean Dilda
> >>Cc: [email protected]
> >>Subject: RE: [Ganglia-general] Multiple clusters single gmetad
> >>
> >>Hi Scott:
> >>
> >>The different IP addresses showing in gmetad.conf are not the IP
> >>address of the server running gmetad; each is instead one of the
> >>nodes (the headnode of that subgroup, if you will), which runs gmond.
> >>
> >>Check the syntax of the data_source tag, the ip/hostname listed there
> >>refers to one or all the nodes in that subgroup/datasource.
> >>
> >>Cheers,
> >>
> >>Bernard
> >>
> >>>-----Original Message-----
> >>>From: [EMAIL PROTECTED]
> >>>[mailto:[EMAIL PROTECTED] On Behalf Of
> >>>Foster, Scott (MS)
> >>>Sent: Tuesday, November 02, 2004 9:14
> >>>To: Sean Dilda
> >>>Cc: [email protected]
> >>>Subject: RE: [Ganglia-general] Multiple clusters single gmetad
> >>>
> >>>How does that work with the gmetad having 3 different IP addresses?
> >>>My goal was to have one server running gmetad, rrd, and apache. Do I
> >>>have to have gmetad running on another system to break my cluster
> >>>into sub-clusters?
> >>>
> >>>Thanks for the earlier reply. I appreciate it.
> >>>
> >>>Scott
> >>>
> >>>-----Original Message-----
> >>>From: Sean Dilda [mailto:[EMAIL PROTECTED]
> >>>Sent: Monday, November 01, 2004 5:14 PM
> >>>To: Foster, Scott (MS)
> >>>Cc: [email protected]
> >>>Subject: Re: [Ganglia-general] Multiple clusters single gmetad
> >>>
> >>>On Mon, 2004-11-01 at 17:47, Foster, Scott (MS) wrote:
> >>>>I've successfully setup ganglia (thanks to everyone on this list),
> >>>>but now I want to try and break out our cluster into smaller sub
> >>>>clusters.
> >>>>
> >>>>1) Production Nodes
> >>>>2) Remote Nodes
> >>>>3) Test Nodes
> >>>>
> >>>>I'm trying to do all this with one server running gmetad for all
> >>>>the nodes in the cluster regardless of their different status. Is
> >>>>this even possible?
> >>>>
> >>>>I've been playing with different multicast addresses and xml
> >>>>addresses and haven't had much success. Has anyone ever tried this
> >>>>before?
> >>>
> >>>Yes, this is possible, I'm currently running ganglia like this. I
> >>>have a different gmond.conf for each subcluster.
> >>>The only thing that changes between the gmond.conf's is the "name"
> >>>and "mcast_channel", all other settings are the same.
> >>>As for the gmetad.conf, here's the crucial part of it:
> >>>
> >>>data_source "Head Cluster" 300 10.10.1.19
> >>>data_source "Storage" 300 10.10.1.1
> >>>data_source "Monitor" 300 localhost
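
For reference, here is a rough sketch of how the redundancy advice in the thread might look in gmetad.conf. Only the cluster names, the 300, and the first address on each line come from Sean's message. The second addresses (10.10.1.20 and 10.10.1.2) are made up for illustration. I'm also assuming that the number after the name is the polling interval in seconds and that additional hosts on a line act as fallbacks for the same subcluster.

```
# gmetad.conf (sketch only; the second address on each line is hypothetical)
# Assumed form: data_source "name" [polling interval] host [host ...]
data_source "Head Cluster" 300 10.10.1.19 10.10.1.20
data_source "Storage" 300 10.10.1.1 10.10.1.2
data_source "Monitor" 300 localhost
```

Each host listed should be a node in that subcluster running gmond. As noted in the thread, two hosts per subcluster is probably enough; there should be no need to list every node.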

