Hi there
I'm new to Ganglia and would appreciate some advice on the following.

I have set up a 'master' server running gmetad, gmond and the web server, and I have installed gmond separately on 114 nodes. 92 of these nodes report to gmetad, while the remaining 22 have gone "MIA": gmond is definitely running on them, but they simply do not connect to gmetad.

The gmond configuration is the same default configuration on all nodes. The only difference I can think of is the IP subnet: the 92 working nodes are on '10.192.64.XX' and '10.192.65.XX', while the 22 missing nodes are on '10.192.84.XX' and '10.192.85.XX'. If that is the cause, what is the solution?

Here are the settings from my gmetad.conf and gmond.conf:

gmetad -

data_source "GRID_UK_PROD_CLUSTER" grd4001a.gdc.abcxyz.com:8649

gmond -

/* This configuration is as close to 2.5.x default behavior as possible
   The values closely match ./gmond/metric.h definitions in 2.5.x */
globals {
  daemonize = yes
  setuid = yes
  user = ganglia
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  host_dmax = 0 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
}

/* If a cluster attribute is specified, then all gmond hosts are wrapped inside
 * of a <CLUSTER> tag. If you do not specify a cluster tag, then all <HOSTS> will
 * NOT be wrapped inside of a <CLUSTER> tag. */
cluster {
  name = "GRID_UK_PROD_CLUSTER"
  owner = "Enterprise Grid Support"
  latlong = "unspecified"
  url = "unspecified"
}

/* The host section describes attributes of the host, like the location */
host {
  location = "GDC"
}

/* Feel free to specify as many udp_send_channels as you like. Gmond
   used to only support having a single channel */
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}

/* You can specify as many tcp_accept_channels as you like to share
   an xml description of the state of the cluster */
tcp_accept_channel {
  port = 8649
}

Cheers!
Joseph

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
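
A minimal sketch, in Python, of one way the subnet guess above could be checked. It reads the XML that gmond serves on its tcp_accept_channel (port 8649 in the config above) and groups the reported hosts by network. The function names fetch_gmond_xml and hosts_by_subnet are hypothetical, the /24 prefix length is an assumption (the actual netmask is not stated here), and the sketch assumes the dump contains <HOST> elements carrying NAME and IP attributes.

```python
import ipaddress
import socket
import xml.etree.ElementTree as ET
from collections import defaultdict


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump gmond writes on its tcp_accept_channel.

    gmond sends the whole document and then closes the connection,
    so we simply read until recv() returns no more data.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_by_subnet(xml_bytes, prefix_len=24):
    """Group every <HOST> in the dump by the network its IP falls in.

    prefix_len=24 is an assumption; adjust it to the real netmask.
    """
    root = ET.fromstring(xml_bytes)
    groups = defaultdict(list)
    for host in root.iter("HOST"):
        ip = host.get("IP")
        if ip is None:
            continue  # skip entries without an address
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        groups[str(net)].append(host.get("NAME"))
    return dict(groups)


# Example usage against the gmond that gmetad polls (needs network access):
#   xml_bytes = fetch_gmond_xml("grd4001a.gdc.abcxyz.com")
#   for net, names in sorted(hosts_by_subnet(xml_bytes).items()):
#       print(net, len(names))
```

If the 10.192.84.x and 10.192.85.x networks never appear in the output for grd4001a, that gmond is not receiving metrics from those nodes, which would be consistent with the subnet guess.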