Hi Johan,
You need to use the package org.apache.hadoop.metrics.ganglia in order to
report Hadoop metrics to Ganglia. Since it is built into the Hadoop code
itself, you don't need any extra jar for the Ganglia integration.
First you need a Ganglia setup running on your cluster, in which gmond
collects all your metrics. The configuration file for gmond is
/etc/gmond.conf.
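If you don't have gmond configured yet, the relevant parts of
/etc/gmond.conf look roughly like this. This is only a sketch using the
Ganglia 3.x block syntax; the cluster name is made up and the ports just
match the defaults used below.

# Name of the cluster this gmond belongs to
cluster {
  name = "hadoop-cluster"
}

# Where this gmond sends the metrics it gathers itself
udp_send_channel {
  host = localhost
  port = 8649
}

# Where this gmond listens for incoming metrics
# (Hadoop's GangliaContext sends UDP packets here)
udp_recv_channel {
  port = 8649
}

# Lets gmetad (and you) poll the collected metrics as XML
tcp_accept_channel {
  port = 8649
}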
To integrate Hadoop with Ganglia, you then modify the
hadoop-metrics.properties file. By default it uses NullContext; you need
to change it to use GangliaContext, as given below.
# Configuration of the "dfs" context for ganglia
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.period=10
dfs.servers=localhost:8649
# Configuration of the "mapred" context for ganglia
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
mapred.period=10
mapred.servers=localhost:8649
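Note that hadoop-metrics.properties is only read at startup, so restart
the Hadoop daemons (NameNode, DataNode, JobTracker, TaskTracker) after
editing it. To verify that gmond is actually receiving the metrics, you
can dump its XML state over the tcp_accept_channel and filter for the
Hadoop metric names. This assumes the default port 8649 from above and
that netcat is installed:

# Dump gmond's XML snapshot and look for Hadoop metrics
nc localhost 8649 | grep -iE "dfs|mapred"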
Once you start running your Hadoop jobs, the
org.apache.hadoop.metrics.ganglia package will start sending your Hadoop
metrics (DFS metrics to the gmond at dfs.servers=localhost:8649 and
MapReduce metrics to the one at mapred.servers=localhost:8649). If you
want to send the metrics to some other gmond, just change those settings,
e.g. dfs.servers=remote_server1.com:8649 and
mapred.servers=remote_server2.com:8649. All your Hadoop metrics will then
be collected on remote_server1.com and remote_server2.com, and you can
use gmetad to pull the information from those servers, as sketched below.
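On the gmetad side this is just a matter of listing the gmonds as data
sources in /etc/gmetad.conf, roughly like this. The cluster names are
made up, and I am assuming the two gmonds are configured as separate
clusters; if they share one cluster, a single data_source line polling
both hosts would act as failover instead.

# Each data_source line names a cluster and the gmond(s) to poll
data_source "hadoop-dfs" remote_server1.com:8649
data_source "hadoop-mapred" remote_server2.com:8649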
Hope I answered your question. If anything is not clear, let us know.
Thanks,
/Aroop
Johan Oskarsson wrote:
Hi.
Could some kind soul write a short howto on the hadoop wiki on using the
Hadoop -> Ganglia metrics component?
I've tried to set it up as the javadoc suggests but no luck. I'm no
Ganglia expert though, so perhaps I need to change something on that end?
/Johan