Mark,

What does the output from 'gstat' return?
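
In case it is useful alongside gstat, the telnet check discussed in the
quoted messages can also be scripted. This is only a rough sketch under a
few assumptions: that gmond writes its XML report to anyone who connects to
TCP port 8649 and then closes the connection, and that each HOST element
carries a NAME attribute (the excerpt quoted below does not show this). The
function names are made up for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read everything the remote end sends until it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def host_names(xml_bytes):
    """Return the NAME attribute of every HOST element in the report.

    An empty response (the "Connection closed by foreign host" case)
    yields an empty list instead of a parse error.
    """
    if not xml_bytes.strip():
        return []
    root = ET.fromstring(xml_bytes)
    # Assumption: HOST elements have a NAME attribute.
    return [h.get("NAME") for h in root.iter("HOST")]
```

Calling host_names(fetch_gmond_xml("bambino1")) from the manager, and the
same against the manager from a node, would show which gmond reports which
hosts. An empty list means the connection opened but no XML came back.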

--tjn

 _________________________________________________________________________
  Thomas Naughton                                      [EMAIL PROTECTED]
  Research Associate                                   (865) 576-4184


On Thu, 5 Sep 2002, Mark Horner wrote:

> Hi,
> 
> OK, we have made significant progress, I think. If I am on the manager and 
> telnet to localhost 8649 I get xml data. And if I am on my node I get 
> local xml data. I have turned pfilter off on both and still no luck when I 
> try from one to the other.
> 
> I restart gmond on both and then refresh my ganglia page. Am I leaving 
> something out?
> 
> Would an upgrade of ganglia be the best way to go? It might fix whatever is 
> wrong even if I never find out what it was. I would like to figure this 
> out, though.
> 
> Could you give me a layman's explanation of how the multicast channel 
> relates to the IPs I chose for my cluster (192.168.1.[1-21])? Could this 
> be a problem - I just left it on default?
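
On the multicast question, roughly: 239.2.11.71 is a multicast group
address, not a host address, so it does not have to fall inside the
192.168.1.x range used for the nodes. Each gmond joins that group on one
network interface, which is why the interface choice (mcast_if) and the
route for the group tend to matter more than the channel value itself. A
small sketch of the distinction, using Python's standard ipaddress module;
the describe() helper is hypothetical and only classifies addresses:

```python
import ipaddress


def describe(addr):
    """Classify an IPv4 address as a multicast group or a private host address."""
    ip = ipaddress.ip_address(addr)
    if ip.is_multicast:   # anything in 224.0.0.0/4
        return "multicast group"
    if ip.is_private:     # e.g. 192.168.0.0/16
        return "private unicast"
    return "other"


CHANNEL = "239.2.11.71"                            # mcast_channel from gmond.conf
NODES = [f"192.168.1.{n}" for n in range(1, 22)]   # 192.168.1.[1-21]

print(CHANNEL, "->", describe(CHANNEL))
print(NODES[0], "->", describe(NODES[0]))
```

The node addresses identify individual machines; the channel only names
the group that all of them send to and listen on.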
> 
> Mark
> 
> 
> 
> 
> On Thu, 5 Sep 2002, Joe Griffin wrote:
> 
> > Mark,
> > 
> > > thanks for the help. Unfortunately my telnet issues have not improved.
> > > I created a gmond.conf file in /etc which has all the suggested  inputs. 
> > > And restarted the service. No luck.
> > 
> > Don't thank me until it works :-)
> > 
> > I did not see this mentioned: can you telnet to port 8649 on the machine itself?
> > 
> > Which version of Ganglia are you running?  I
> > have:
> > 
> > virtue:82) rpm -q ganglia-monitor-core
> > ganglia-monitor-core-2.4.1-1
> > 
> > 
> > I believe older versions required the mcast_if to be
> > set in /etc/init.d/gmond:
> > 
> >     daemon $GMOND  --mcast_if=eth0
> > 
> > 
> > 
> > > I put the same file on bambino1 one and tried after restarting the 
> > > service and no luck. I get the same negative response when I try to 
> > > telnet to 8649 on any of the machines from any other one.
> > > 
> > > I am getting what I consider anomalous behaviour when I start and stop the 
> > > service - see below:
> > > 
> > > root@qgp3:/etc>service gmond start
> > > Starting GANGLIA gmond:                                    [  OK  ]
> > > root@qgp3:/etc>service gmond stop
> > > Shutting down GANGLIA gmond: /etc/init.d/gmond: kill: (7394) - No such process
> > > /etc/init.d/gmond: kill: (7393) - No such process
> > > /etc/init.d/gmond: kill: (7392) - No such process
> > > /etc/init.d/gmond: kill: (7391) - No such process
> > > /etc/init.d/gmond: kill: (7390) - No such process
> > > /etc/init.d/gmond: kill: (7389) - No such process
> > > /etc/init.d/gmond: kill: (7388) - No such process
> > > /etc/init.d/gmond: kill: (7387) - No such process
> > > /etc/init.d/gmond: kill: (7386) - No such process
> > >                                                            [  OK  ]
> > > root@qgp3:/etc>service gmond start
> > > Starting GANGLIA gmond:                                    [  OK  ]
> > > root@qgp3:/etc>service gmond stop
> > > Shutting down GANGLIA gmond:                               [  OK  ]
> > > root@qgp3:/etc>service gmond start
> > > Starting GANGLIA gmond:                                    [  OK  ]
> > > root@qgp3:/etc>
> > 
> > 
> > Huh?
> > 
> > Are you saying that sometimes you start/stop and get
> > an error, and sometimes you start/stop and do not
> > get the error?
> > 
> > You might try turning up the "debug."  Perhaps that
> > will give you more information:
> > 
> > # Run gmond in "debug" mode.  Gmond will not background.  Debug messages
> > # are sent to stdout.  Value from 0-100.  The higher the number the more
> > # detailed debugging information will be sent.
> > # default: 0
> > # debug_level 10
> > 
> > Another possibility (but I think it's a long
> > shot) is to set a static route for the ganglia multicast channel:
> > 
> > route add -host 239.2.11.71 dev eth0
> > 
> > Joe
> > 
> > 
> > > 
> > > Another thing that worries me is that gmond -help mentions nothing 
> > > about config files.
> > > 
> > > Any other things I might try?
> > > 
> > > I have the following in my gmond.conf file:
> > > 
> > >  setuid nobody
> > >  all_trusted on
> > >  mcast_channel 239.2.11.71
> > >  mcast_port 8649
> > >  mcast_ttl 1
> > >  mcast_threads 2
> > >  xml_port 8649
> > >  num_nodes 22
> > >  xml_threads 2
> > > 
> > > Thanks, 
> > > 
> > > Mark
> > > 
> > > 
> > > On Thu, 5 Sep 2002, Joe Griffin wrote:
> > > 
> > > 
> > >>Hi Mark,
> > >>
> > >>I have four comments:
> > >>
> > >>1. Is gmond running on bambino1 as well as on the headnode?
> > >>    The "telnet bambino1 8649" should produce output like:
> > >>
> > >>    virtue:81) telnet msc1 8649
> > >>Trying 192.168.3.21...
> > >>Connected to msc1.
> > >>Escape character is '^]'.
> > >><?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
> > >><!DOCTYPE GANGLIA_XML [
> > >>    <!ELEMENT GANGLIA_XML (CLUSTER)+>
> > >>    <!ATTLIST GANGLIA_XML VERSION CDATA #REQUIRED
> > >>                          SOURCE  CDATA #REQUIRED>
> > >>    <!ELEMENT CLUSTER (HOST)+>
> > >>    <!ATTLIST CLUSTER NAME  CDATA #REQUIRED
> > >>                      LOCALTIME CDATA #REQUIRED>
> > >>
> > >>     ... lines deleted ...
> > >>
> > >>2. If you are logged on the headnode, can you "telnet $HEADNODE 8649"?
> > >>
> > >>3. If you are on bambino1, can you "telnet bambino1 8649"?
> > >>
> > >>4. You mentioned "gmond -ieth1".  I assume eth1 is the NIC
> > >>    connecting to your cluster.  If so, have you put the
> > >>    following in /etc/gmond.conf:
> > >>
> > >>     mcast_if  eth1
> > >>
> > >>     Then restart the daemons:
> > >>
> > >>    /etc/init.d/gmond stop
> > >>    /etc/init.d/gmond start
> > >>
> > >>
> > >>You should be able to telnet to your headnode (my #2) on port
> > >>8649 and see all the attached nodes.  If you cannot
> > >>it is either because the compute nodes are NOT running
> > >>gmond (my #1) or the gmond on the headnode can't see the
> > >>gmond on the compute nodes (my #4).  Trying to do the
> > >>telnet from bambino1 will let you know if gmond is
> > >>running correctly on it.
> > >>
> > >>
> > >>Regards,
> > >>Joe Griffin
> > >>MSC.Software
> > >>
> > >>
> > >>
> > >>Mark Horner wrote:
> > >>
> > >>>Hi,
> > >>>
> > >>>Ganglia only shows my head node.
> > >>>
> > >>>I am using oscar 1.4b4 on RH 7.3. I have checked that gmond is running
> > >>>on my nodes and on the manager - I have tried gmond -ieth1 to no avail.
> > >>>
> > >>>In a previous posting someone suggested telnetting to port 8649 and that a 
> > >>>stream of xml data should be visible - this isn't the case:
> > >>>
> > >>>
> > >>>
> > >>>>telnet bambino1 8649
> > >>>
> > >>>Trying 192.168.1.2...
> > >>>Connected to bambino1.phy.uct.ac.za (192.168.1.2).
> > >>>Escape character is '^]'.
> > >>>Connection closed by foreign host.
> > >>>
> > >>>Any suggestions - could it be a firewall issue?
> > >>>
> > >>
> > >>
> > >>
> > > 
> > 
> > 
> > 
> 
> -- 
> Mark Horner
> 
> Physics Department
> University of Cape Town
> Rondebosch
> 7700
> South Africa
> 
> Phone: +27 21 650 3366 (office)
> Phone: +27 83 564 6272 (cellular)
> Fax:   +27 21 650 3342
> 
> 
> 
> -------------------------------------------------------
> This sf.net email is sponsored by: OSDN - Tired of that same old
> cell phone?  Get a new here for FREE!
> https://www.inphonic.com/r.asp?r=sourceforge1&refcode1=vs3390
> _______________________________________________
> Oscar-users mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/oscar-users
> 



