Have you tried using gmetric from the command line, to see whether the
problem is in your properties file or in gmond.conf?
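
If it helps to make that comparison repeatable, here is a rough Python sketch (my own, not anything shipped with Ganglia) that pulls the XML dump from two gmonds and lists the metrics one of them has for a host that the other lacks. It assumes both gmonds answer on the default tcp_accept_channel port 8649; the host names in the usage comment are placeholders.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the whole XML dump gmond writes when a client connects over TCP."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    # Return bytes so the XML parser can honor the declared encoding.
    return b"".join(chunks)


def metrics_by_host(xml_data):
    """Map each HOST NAME to the set of METRIC NAMEs reported under it."""
    root = ET.fromstring(xml_data)
    return {
        host.get("NAME"): {m.get("NAME") for m in host.iter("METRIC")}
        for host in root.iter("HOST")
    }


def missing_on_remote(local_xml, remote_xml, hostname):
    """Metrics the local gmond has for hostname that the remote one lacks."""
    local = metrics_by_host(local_xml).get(hostname, set())
    remote = metrics_by_host(remote_xml).get(hostname, set())
    return sorted(local - remote)


# Usage (placeholder host names):
#   local = fetch_gmond_xml("localhost")
#   remote = fetch_gmond_xml("collector.example.com")
#   print(missing_on_remote(local, remote, "foo.bar.com"))
```

If a test value sent with gmetric shows up on both sides while the Hadoop metrics show up only locally, that points at the properties file rather than gmond.conf.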

Last I saw, Hadoop did *not* read the gmond.conf file to get its  
destination properties. I hope I'm wrong and somebody's put together a  
more integrated configuration system, but...

The example at http://wiki.apache.org/hadoop/GangliaMetrics explains
that you have to specify your gmond destinations in your
hadoop-metrics.properties file; I suspect you need a comma-separated list
there (or a multicast address) instead of just "localhost".

That is, the package description at
http://hadoop.apache.org/common/docs/r0.18.3/api/org/apache/hadoop/metrics/ganglia/package-summary.html#package_description
reads as if you can use a value like

jvm.servers=server1:8649,server2:8649

OTOH, the wiki page seems to suggest the developers only used  
multicast (or a single collector).  We always use multicast, too, so I  
haven't had occasion to try the multiple-server syntax.
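
For reference, a hadoop-metrics.properties fragment along those lines might look like the sketch below. The GangliaContext class name and the class/period/servers keys follow the wiki example; server1 and server2 are placeholders, the multi-server form is the untested syntax mentioned above, and 239.2.11.71 is only Ganglia's stock multicast address, so substitute whatever your gmond.conf actually uses.

```properties
# Sketch only: send the dfs and jvm contexts to Ganglia every 10 seconds.
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.period=10
# Untested assumption: comma-separated list of gmond receivers.
dfs.servers=server1:8649,server2:8649

jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.period=10
jvm.servers=server1:8649,server2:8649

# Multicast alternative (use the mcast_join address from your gmond.conf):
# dfs.servers=239.2.11.71:8649
# jvm.servers=239.2.11.71:8649
```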

-- ReC

On Dec 21, 2009, at 3:07 PM, Travis Ganglia wrote:

> Hey ganglia gurus -
>
> SHORT VERSION:
> Some gmetric-published stats available at a local gmond are not being
> published to other gmond instances. However, some stats are shared
> between gmond instances. Any ideas why some stats would not be shared
> while others are?
>
>
> DETAILS:
> I'm in the process of using Ganglia to monitor a Hadoop cluster and
> have encountered a strange issue. All cluster nodes have basic stats
> being collected and shared as expected (such as mem_buffers). Hadoop
> stats are being published to the local gmond as expected (such as
> dfs.datanode.blocks_verified). However, the hadoop stats are not being
> shared amongst gmond instances and thus are not collected by gmetad. I
> don't think the issue is Hadoop-specific.
>
>
> Telnetting to the local gmond and looking at its XML output I see the
> Hadoop stats as expected; for example:
>
> <METRIC NAME="dfs.datanode.blocks_verified" VAL="18364" TYPE="int32"
> UNITS="" TN="3" TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
>
>
> Looking at another gmond (listed in udp_send_channel) we see the host
> is present and up-to-date. For example:
>
> <HOST NAME="foo.bar.com" IP="1.2.3.4" REPORTED="1261436155" TN="2"
> TMAX="20" DMAX="86400" LOCATION="1" GMOND_STARTED="1261434025">
> <METRIC NAME="mem_buffers" VAL="336572" TYPE="uint32" UNITS="KB"
> TN="2" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
>
> However, the dfs.datanode.blocks_verified statistic is not present.
> Any tips on how to troubleshoot this would be extremely helpful as I'm
> not sure why a subset of metrics would be shared.
>
> Thanks!
> Travis
>
> ------------------------------------------------------------------------------
> This SF.Net email is sponsored by the Verizon Developer Community
> Take advantage of Verizon's best-in-class app development support
> A streamlined, 14 day to market process makes app distribution fast  
> and easy
> Join now and get one step closer to millions of Verizon customers
> http://p.sf.net/sfu/verizon-dev2dev
> _______________________________________________
> Ganglia-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/ganglia-general

