Hi there,
I'm new to Ganglia and would appreciate some advice on the following.
I have set up a 'master' server running gmetad, gmond and the web server.
I have also installed gmond separately on 114 nodes. However, only 92 of
these nodes are able to establish a connection with gmetad, while the remaining
22 nodes have gone "MIA". gmond is definitely running on them, but these 22
nodes simply refuse to connect to gmetad.
The gmond configuration is the same default configuration on all nodes, and the
only cause I can think of is the difference in IP subnet. The 92 working nodes
are on '10.192.64.XX' and '10.192.65.XX', whilst the other 22 nodes are on
'10.192.84.XX' and '10.192.85.XX'. If that is the cause, may I know what the
solution is?
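To make my subnet guess concrete, here is a minimal Python sketch that groups node addresses by subnet. The /23 prefix length is purely my assumption (the actual netmask is not something I have confirmed), and the four sample addresses are hypothetical stand-ins for the 'XX' ranges above. As far as I understand, multicast packets sent with ttl = 1 (as in the udp_send_channel further down) are not forwarded past the local network segment, which is why the grouping might matter.

```python
import ipaddress
from collections import defaultdict

# Assumption: each subnet is a /23. The real netmask is not stated in this post.
ASSUMED_PREFIX = 23


def group_by_subnet(addresses, prefix=ASSUMED_PREFIX):
    """Map each enclosing subnet (as a string) to the addresses inside it."""
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False accepts a host address and returns its enclosing network
        net = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
        groups[str(net)].append(addr)
    return dict(groups)


# Hypothetical sample addresses, one per range mentioned above
sample = ["10.192.64.10", "10.192.65.10", "10.192.84.10", "10.192.85.10"]

if __name__ == "__main__":
    for subnet, hosts in sorted(group_by_subnet(sample).items()):
        print(subnet, hosts)
```

Under the /23 assumption the 64/65 addresses fall into one subnet and the 84/85 addresses into another, which would be consistent with the split between the 92 working nodes and the 22 missing ones.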
Here are the settings from my gmetad.conf and gmond.conf:
gmetad -
data_source "GRID_UK_PROD_CLUSTER" grd4001a.gdc.abcxyz.com:8649
gmond -
/* This configuration is as close to 2.5.x default behavior as possible
The values closely match ./gmond/metric.h definitions in 2.5.x */
globals {
daemonize = yes
setuid = yes
user = ganglia
debug_level = 0
max_udp_msg_len = 1472
mute = no
deaf = no
host_dmax = 0 /*secs */
cleanup_threshold = 300 /*secs */
gexec = no
}
/* If a cluster attribute is specified, then all gmond hosts are wrapped inside
* of a <CLUSTER> tag. If you do not specify a cluster tag, then all <HOSTS> will
* NOT be wrapped inside of a <CLUSTER> tag. */
cluster {
name = "GRID_UK_PROD_CLUSTER"
owner = "Enterprise Grid Support"
latlong = "unspecified"
url = "unspecified"
}
/* The host section describes attributes of the host, like the location */
host {
location = "GDC"
}
/* Feel free to specify as many udp_send_channels as you like. Gmond
used to only support having a single channel */
udp_send_channel {
mcast_join = 239.2.11.71
port = 8649
ttl = 1
}
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
mcast_join = 239.2.11.71
port = 8649
bind = 239.2.11.71
}
/* You can specify as many tcp_accept_channels as you like to share
an xml description of the state of the cluster */
tcp_accept_channel {
port = 8649
}
Cheers!
Joseph
_______________________________________________
Ganglia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-developers