Have you tried using gmetric from the command line, to see if it's your properties file versus the gmond.conf?
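Something along these lines would do as a rough check. The conf path and the metric name gmetric_test are placeholders I made up, so adjust them for your install:

```shell
# Placeholder path (an assumption); point this at the gmond.conf your nodes use.
GMOND_CONF="${GMOND_CONF:-/etc/ganglia/gmond.conf}"

# Publish a throwaway metric through the send channels defined in gmond.conf,
# bypassing Hadoop and its properties file entirely.
# The metric name gmetric_test is hypothetical, chosen to be easy to spot.
send_test_metric() {
    gmetric -c "$GMOND_CONF" -n gmetric_test -v 1 -t int32
}
```

Run send_test_metric on one of the datanodes, then telnet to each gmond as you did before and look for gmetric_test. If it shows up on the other gmonds, gmond.conf is probably fine and the Hadoop properties file is the more likely culprit; if it doesn't, I'd start with the send channels in gmond.conf.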
Last I saw, Hadoop did *not* read the gmond.conf file to get its destination properties. I hope I'm wrong and somebody's put together a more integrated configuration system, but...

The example at http://wiki.apache.org/hadoop/GangliaMetrics explains that you have to specify your gmond destinations in your hadoop-metrics.properties file; I suspect you need a comma-separated list there (or a multicast address) instead of just "localhost". That is, the class definition at http://hadoop.apache.org/common/docs/r0.18.3/api/org/apache/hadoop/metrics/ganglia/package-summary.html#package_description reads as if you can use a value like

    jvm.servers=server1:8649,server2:8649

OTOH, the wiki page seems to suggest the developers only used multicast (or a single collector). We always use multicast, too, so I haven't had occasion to try the multiple-server syntax.

-- ReC

On Dec 21, 2009, at 3:07 PM, Travis Ganglia wrote:

> Hey ganglia gurus -
>
> SHORT VERSION:
> Some gmetric-published stats available at a local gmond are not being
> published to other gmond instances. However, some stats are shared
> between gmond instances. Any ideas why some stats would not be shared
> while others are?
>
>
> DETAILS:
> I'm in the process of using Ganglia to monitor a Hadoop cluster and
> have encountered a strange issue. All cluster nodes have basic stats
> being collected and shared as expected (such as mem_buffers). Hadoop
> stats are being published to the local gmond as expected (such as
> dfs.datanode.blocks_verified). However, the hadoop stats are not being
> shared amongst gmond instances and thus are not collected by gmetad. I
> don't think the issue is Hadoop-specific.
>
>
> Telnetting to the local gmond and looking at its XML output I see the
> Hadoop stats as expected; for example:
>
> <METRIC NAME="dfs.datanode.blocks_verified" VAL="18364" TYPE="int32"
> UNITS="" TN="3" TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
>
>
> Looking at another gmond (listed in udp_send_channel) we see the host
> is present and up-to-date. For example:
>
> <HOST NAME="foo.bar.com" IP="1.2.3.4" REPORTED="1261436155" TN="2"
> TMAX="20" DMAX="86400" LOCATION="1" GMOND_STARTED="1261434025">
> <METRIC NAME="mem_buffers" VAL="336572" TYPE="uint32" UNITS="KB"
> TN="2" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
>
> However, the dfs.datanode.blocks_verified statistic is not present.
> Any tips on how to troubleshoot this would be extremely helpful as I'm
> not sure why a subset of metrics would be shared.
>
> Thanks!
> Travis
>
> ------------------------------------------------------------------------------
> This SF.Net email is sponsored by the Verizon Developer Community
> Take advantage of Verizon's best-in-class app development support
> A streamlined, 14 day to market process makes app distribution fast
> and easy
> Join now and get one step closer to millions of Verizon customers
> http://p.sf.net/sfu/verizon-dev2dev
> _______________________________________________
> Ganglia-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/ganglia-general

