I have just installed Ganglia 3.0.5 on two machines.  Both were configured
without gexec, and with gmetad on one but not the other.  gmond and gmetad
start with no errors, but I am having problems with gmond on one of my
machines.

I'm far from an expert and would appreciate a pointer or two.  I've
searched the mail archives, read the gmond man page, and Googled the
ARPANet, but haven't found anything except this posting:

http://www.mail-archive.com/[email protected]/msg02601.html

That poster had an error in his conf file, but didn't post what the error was.

Below is what I believe to be the pertinent information.  If more is
needed, I will post it.

Issuing gstat on the head node gives the following:

        CLUSTER INFORMATION
        Name: clusterfsck
        Hosts: 1
        Gexec Hosts: 0
        Dead Hosts: 0
        Localtime: Sat Oct 27 16:34:15 2007

        There are no hosts running gexec at this time

When I run gstat -i 172.16.1.101 from the head node, I get the following
error, and the gmond daemon on 172.16.1.101 is killed.

        gexec_cluster() XML_ParseBuffer() error at line 51:
        no element found

        Unable to get hostlist from 172.16.1.101 8649!

SSHing into 172.16.1.101 and issuing a gstat gives the same error and
kills the gmond daemon.

        gexec_cluster() XML_ParseBuffer() error at line 51:
        no element found

        Unable to get hostlist from localhost 8649!
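
XML_ParseBuffer() is an expat call, and expat's "no element found" generally means the document ended before its root element was closed. One way to see what gmond actually sends before it dies is to read the TCP port directly and run the bytes through a parser. The following is a minimal sketch, assuming gmond dumps its XML on connect to the tcp_accept_channel port (8649 in the config) and then closes the connection; the function names are hypothetical and not part of Ganglia.

```python
import socket
import xml.parsers.expat


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read everything the remote end writes until it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def check_xml(payload):
    """Return None if payload is well-formed XML, else (line, message)."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(payload, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError as err:
        return (err.lineno, xml.parsers.expat.ErrorString(err.code))
    return None


# Example use (needs a reachable gmond, so it is left commented out):
# raw = fetch_gmond_xml("172.16.1.101")
# print(raw.decode("latin-1"))   # show where the output stops
# print(check_xml(raw))          # None, or (line, message) on a parse error
```

If the raw dump stops partway through a tag near line 51, that would suggest gmond is dying while writing its output rather than gstat misreading a complete document.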

I am using essentially the default configuration on both machines as
created by:

        gmond --default_config > /etc/gmond.conf

<SNIP>
/* This configuration is as close to 2.5.x default behavior as possible 
   The values closely match ./gmond/metric.h definitions in 2.5.x */ 
globals {                    
  daemonize = yes              
  setuid = no             
  user = nobody              
  debug_level = 0               
  max_udp_msg_len = 1472        
  mute = no             
  deaf = no             
  host_dmax = 0 /*secs */ 
  cleanup_threshold = 300 /*secs */ 
  gexec = no             
} 

/* If a cluster attribute is specified, then all gmond hosts are wrapped
 * inside of a <CLUSTER> tag.  If you do not specify a cluster tag, then
 * all <HOSTS> will NOT be wrapped inside of a <CLUSTER> tag. */ 
cluster { 
  name = "clusterfsck" 
  owner = "The ReliaFree Project" 
  latlong = "unspecified" 
  url = "http://reliafree.sourceforge.net"; 
} 

/* The host section describes attributes of the host, like the location */ 
host { 
  location = "unspecified" 
} 

/* Feel free to specify as many udp_send_channels as you like.  Gmond 
   used to only support having a single channel */ 
udp_send_channel { 
  mcast_join = 239.2.11.71
  port = 8649 
  ttl = 1
} 

/* You can specify as many udp_recv_channels as you like as well. */ 
udp_recv_channel { 
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
} 

/* You can specify as many tcp_accept_channels as you like to share 
   an xml description of the state of the cluster */ 
tcp_accept_channel { 
  port = 8649 
} 
</SNIP>
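
Since gstat on the head node reports only one host, it may also be worth checking whether multicast traffic from the other machine arrives at all. The following is a rough sketch, assuming IPv4 and the mcast_join group and port from the config above; the helper names are hypothetical, and gmond may need to be stopped on the listening host if the port cannot be shared.

```python
import socket

GROUP = "239.2.11.71"  # mcast_join value from the config above
PORT = 8649            # port value from the config above


def membership_request(group, iface="0.0.0.0"):
    """Build the ip_mreq payload: group address followed by local interface."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def listen(group=GROUP, port=PORT, count=5, timeout=30.0):
    """Return the sender addresses of up to `count` datagrams seen on the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    sock.settimeout(timeout)
    senders = []
    try:
        for _ in range(count):
            data, (addr, _sender_port) = sock.recvfrom(2048)
            print(f"{len(data)} bytes from {addr}")
            senders.append(addr)
    except socket.timeout:
        print("timed out waiting for multicast traffic")
    finally:
        sock.close()
    return senders


# Example use on the head node (left commented out, needs a live network):
# listen()
```

If only the head node's own address shows up, the metric packets from 172.16.1.101 are not reaching it, which would be consistent with the "Hosts: 1" line. Note that ttl = 1 in udp_send_channel keeps those packets on the local subnet.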

TIA,

Andrew

-- 
Andrew "Weibullguy" Rowland
Reliability & Safety Engineer

[EMAIL PROTECTED]
http://webpages.charter.net/weibullguy
http://reliafree.sourceforge.net
