Sorry to follow up on my own post, but I gave it one more shot this morning and changed the setting to dfs.servers=239.2.11.71:8649 (the multicast address).
Though I am sure I tried that before, it works this time. Perhaps the Ganglia system was in some unusual state before.

On 09/11/11 08:27, robert wrote:
> I downloaded the latest version of Ganglia and compiled and installed
> on my Hadoop cluster. Configured according to the documented
> procedures. The latest stable version of Ganglia is 3.2, and I am
> using hadoop-0.20.2-cdh31
>
> I just copied the gmond.conf from the distribution to the nodes. It
> has what look like default values 239.2.11.71 for mcast_join and port
> 8649 throughout.
>
> The core (non hadoop) Ganglia reporting works fine, but Ganglia is not
> communicating with Hadoop in any reproducible way. I got reporting on
> one node once, got a *different* node reported from telnet localhost
> 8649 once, but more generally get no reporting of hadoop metrics at
> all! When I bounce the cluster and/or gmond I may or may not get any
> difference in behavior. It is frustrating because the behavior seems
> to be random and not reproducible.
>
> I wonder if there is a problem with version compatibility? If there
> were release notes indicating a compatibility issue I didn't see them
> on the ganglia site. At this point, I'm tempted to give up on Ganglia
> for hadoop metrics and look for alternatives.
>
> Any ideas?
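
P.S. For anyone who finds this thread later, here is a minimal sketch of what the relevant part of hadoop-metrics.properties might look like. Only the dfs.servers value comes from my setup; the dfs.class and dfs.period lines are assumptions based on the stock template that ships with Hadoop 0.20. As far as I know, gmond 3.1 and later expects the GangliaContext31 class rather than the older GangliaContext, so check which one your build actually provides.

```properties
# Sketch of the dfs metrics context.
# The class and period lines are assumptions, not copied from my file.
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# Reporting interval in seconds (assumed value)
dfs.period=10
# Multicast address and port, matching mcast_join and port in gmond.conf
dfs.servers=239.2.11.71:8649
```

The mapred and jvm contexts should take the same three keys under their own prefixes, and the Hadoop daemons need a restart before they pick up the change.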
