Hey Rick -

Thanks for the tip! Your suggestion was exactly the issue and I've
updated the hadoop metrics config to specify a comma-separated list of
the first few hosts in my hadoop data_source. Works like a charm now.

My key misunderstanding was thinking a local metric was published to a
local gmond, which was responsible for relaying the metric to every
udp_send_channel. I wasn't aware metric publishers are responsible for
sending the metric to everyone themselves.
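
For anyone who finds this thread later, here is a rough Python sketch of
the fan-out behavior as I now understand it: the publisher parses a
comma-separated server list and sends the same UDP datagram to each
destination itself. The function names (parse_servers, publish) and the
payload are made up for illustration. This is not Hadoop's or gmetric's
actual code, and a real metric would need gmond's wire encoding, which is
not shown here. The only Ganglia-specific detail is 8649, gmond's default
port.

```python
import socket


def parse_servers(spec, default_port=8649):
    """Split 'host1:8649,host2' into a list of (host, port) tuples.

    Hypothetical helper: entries without a port fall back to default_port.
    """
    servers = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        host, _, port = entry.partition(":")
        servers.append((host, int(port) if port else default_port))
    return servers


def publish(payload, servers):
    """Send the same datagram to every destination; return how many were sent.

    The payload is treated as opaque bytes (placeholder, not a real metric).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for addr in servers:
            sock.sendto(payload, addr)
    finally:
        sock.close()
    return len(servers)
```

With something like publish(b"placeholder", parse_servers("server1:8649,server2:8649")),
each listed gmond receives its own copy, and no gmond is expected to pass
the metric along to the others.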

Thanks!
Travis


On Mon, Dec 21, 2009 at 9:16 PM, Rick Cobb <[email protected]> wrote:
>
> Have you tried using gmetric from the command line, to see if it's
> your properties file versus the gmond.conf?
>
> Last I saw, Hadoop did *not* read the gmond.conf file to get its
> destination properties. I hope I'm wrong and somebody's put together a
> more integrated configuration system, but...
>
> The example at http://wiki.apache.org/hadoop/GangliaMetrics explains
> that you have to specify your gmond destinations in your
> hadoop-metrics.properties file; I suspect you need a comma-separated
> list there (or a multicast address) instead of just "localhost".
>
> That is, the class definition at 
> http://hadoop.apache.org/common/docs/r0.18.3/api/org/apache/hadoop/metrics/ganglia/package-summary.html#package_description
> reads as if you can use a value like
>
> jvm.servers=server1:8649,server2:8649
>
> OTOH, the wiki page seems to suggest the developers only used
> multicast (or a single collector).  We always use multicast, too, so I
> haven't had occasion to try the multiple-server syntax.
>
> -- ReC
>
> On Dec 21, 2009, at 3:07 PM, Travis Ganglia wrote:
>
>> Hey ganglia gurus -
>>
>> SHORT VERSION:
>> Some gmetric-published stats available at a local gmond are not being
>> published to other gmond instances. However, some stats are shared
>> between gmond instances. Any ideas why some stats would not be shared
>> while others are?
>>
>>
>> DETAILS:
>> I'm in the process of using Ganglia to monitor a Hadoop cluster and
>> have encountered a strange issue. All cluster nodes have basic stats
>> being collected and shared as expected (such as mem_buffers). Hadoop
>> stats are being published to the local gmond as expected (such as
>> dfs.datanode.blocks_verified). However, the hadoop stats are not being
>> shared amongst gmond instances and thus are not collected by gmetad. I
>> don't think the issue is Hadoop-specific.
>>
>>
>> Telnetting to the local gmond and looking at its XML output I see the
>> Hadoop stats as expected; for example:
>>
>> <METRIC NAME="dfs.datanode.blocks_verified" VAL="18364" TYPE="int32"
>> UNITS="" TN="3" TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
>>
>>
>> Looking at another gmond (listed in udp_send_channel) we see the host
>> is present and up-to-date. For example:
>>
>> <HOST NAME="foo.bar.com" IP="1.2.3.4" REPORTED="1261436155" TN="2"
>> TMAX="20" DMAX="86400" LOCATION="1" GMOND_STARTED="1261434025">
>> <METRIC NAME="mem_buffers" VAL="336572" TYPE="uint32" UNITS="KB"
>> TN="2" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
>>
>> However, the dfs.datanode.blocks_verified statistic is not present.
>> Any tips on how to troubleshoot this would be extremely helpful as I'm
>> not sure why a subset of metrics would be shared.
>>
>> Thanks!
>> Travis
>>
>> ------------------------------------------------------------------------------
>> This SF.Net email is sponsored by the Verizon Developer Community
>> Take advantage of Verizon's best-in-class app development support
>> A streamlined, 14 day to market process makes app distribution fast
>> and easy
>> Join now and get one step closer to millions of Verizon customers
>> http://p.sf.net/sfu/verizon-dev2dev
>> _______________________________________________
>> Ganglia-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/ganglia-general
>
>
