On 09/05/11 22:02, Thomas Jungblut (JIRA) wrote:
[
https://issues.apache.org/jira/browse/HAMA-363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030895#comment-13030895
]
Thomas Jungblut commented on HAMA-363:
--------------------------------------
As far as I know, Hadoop only provides some JVM metrics and host metrics. I
can't find the exact place in the source code, but I think we should
implement our own metrics package, which we can later hook up to Ganglia. That
would be much more useful.
We should define which measurements we need to tell whether there is a problem.
Something like: "We ping every groom every 5 seconds and check the latency."
This can be easily implemented in BSPMaster.
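A minimal sketch of such a periodic latency check, in Java. GroomPing and GroomLatencyMonitor are hypothetical names made up for illustration, not existing Hama or Hadoop classes, and the ping itself stands in for whatever RPC call BSPMaster would really use to reach a groom:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the RPC call that reaches a groom; not a Hama API. */
interface GroomPing {
    void ping() throws Exception;
}

/** Pings every registered groom at a fixed period and keeps the last latency. */
class GroomLatencyMonitor {
    /** Stored when a ping throws, so unreachable grooms show up as well. */
    static final long UNREACHABLE = -1L;

    private final Map<String, GroomPing> grooms = new ConcurrentHashMap<>();
    private final Map<String, Long> latencies = new ConcurrentHashMap<>();
    // Daemon thread, so a forgotten monitor does not keep the JVM alive.
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "groom-pinger");
            t.setDaemon(true);
            return t;
        });

    void register(String groomName, GroomPing groom) {
        grooms.put(groomName, groom);
    }

    /** One round: time a ping to each groom and record the result. */
    void pingAll() {
        for (Map.Entry<String, GroomPing> e : grooms.entrySet()) {
            long start = System.nanoTime();
            long latency;
            try {
                e.getValue().ping();
                latency = System.nanoTime() - start;
            } catch (Exception ex) {
                latency = UNREACHABLE;
            }
            latencies.put(e.getKey(), latency);
        }
    }

    /** Runs pingAll() immediately and then once every periodSeconds. */
    void start(long periodSeconds) {
        scheduler.scheduleAtFixedRate(this::pingAll, 0, periodSeconds, TimeUnit.SECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }

    /** Last latency in nanoseconds, UNREACHABLE, or null if never pinged. */
    Long latencyNanos(String groomName) {
        return latencies.get(groomName);
    }
}
```

With this, the master would call start(5) once at startup to get the 5-second check. Pinging the grooms one after another on a single thread keeps the sketch short, but a slow groom delays the rest of the round, so a real version would probably want a timeout per ping.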
To measure the IN and OUT rate or other fancy stuff we need something like
heartbeat communication that will transfer the local groom data to the master.
This should be in newer versions of Hadoop (>0.21), shouldn't it? I don't have
the source code at hand here.
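To make the heartbeat idea concrete, here is a rough Java sketch of what a groom could count locally and what the master could derive from it. All three types (GroomTrafficSnapshot, GroomTrafficCounters, TrafficRates) are hypothetical and only illustrate the idea; how the snapshot would actually travel inside a heartbeat message is left open:

```java
import java.util.concurrent.atomic.LongAdder;

/** Immutable snapshot a groom could attach to its heartbeat (hypothetical type). */
final class GroomTrafficSnapshot {
    final long timestampMillis;
    final long bytesIn;
    final long bytesOut;

    GroomTrafficSnapshot(long timestampMillis, long bytesIn, long bytesOut) {
        this.timestampMillis = timestampMillis;
        this.bytesIn = bytesIn;
        this.bytesOut = bytesOut;
    }
}

/** Groom-side counters, bumped wherever messages are received or sent. */
class GroomTrafficCounters {
    private final LongAdder bytesIn = new LongAdder();
    private final LongAdder bytesOut = new LongAdder();

    void received(long bytes) {
        bytesIn.add(bytes);
    }

    void sent(long bytes) {
        bytesOut.add(bytes);
    }

    /** Running totals at the given time; the groom would send this to the master. */
    GroomTrafficSnapshot snapshot(long nowMillis) {
        return new GroomTrafficSnapshot(nowMillis, bytesIn.sum(), bytesOut.sum());
    }
}

/** Master-side helper: turns two successive snapshots into bytes per second. */
final class TrafficRates {
    private TrafficRates() {
    }

    static double inRate(GroomTrafficSnapshot prev, GroomTrafficSnapshot cur) {
        return rate(cur.bytesIn - prev.bytesIn, cur.timestampMillis - prev.timestampMillis);
    }

    static double outRate(GroomTrafficSnapshot prev, GroomTrafficSnapshot cur) {
        return rate(cur.bytesOut - prev.bytesOut, cur.timestampMillis - prev.timestampMillis);
    }

    private static double rate(long deltaBytes, long deltaMillis) {
        if (deltaMillis <= 0) {
            return 0.0; // no time elapsed between snapshots, so no meaningful rate
        }
        return deltaBytes * 1000.0 / deltaMillis;
    }
}
```

Sending running totals rather than per-interval deltas means a lost heartbeat does not lose data: the master simply computes the rate over a longer interval from the next snapshot it receives.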
If you are doing perf stuff, I'd go for having some plugin monitoring
that can go in before/after communications.
Why? I'm playing with sFlow monitoring of bits of Hadoop, and it's
tricky to retrofit this stuff deep into the code. If the hooks were
there, it would be easier.
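For illustration, the kind of hook I have in mind could look roughly like the Java sketch below. CommunicationListener, GroomTransport and MonitoredTransport are invented names, not anything that exists in Hama or Hadoop today, and a real hook would likely pass more context than a peer name and a payload size:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Plugin point: called around every send (hypothetical interface). */
interface CommunicationListener {
    void beforeSend(String peer, int payloadBytes);

    void afterSend(String peer, int payloadBytes, long elapsedNanos);
}

/** Hypothetical stand-in for the real message transport. */
interface GroomTransport {
    void send(String peer, byte[] payload) throws Exception;
}

/** Wraps a transport and notifies the registered listeners before and after each send. */
class MonitoredTransport implements GroomTransport {
    private final GroomTransport delegate;
    private final List<CommunicationListener> listeners = new CopyOnWriteArrayList<>();

    MonitoredTransport(GroomTransport delegate) {
        this.delegate = delegate;
    }

    void addListener(CommunicationListener listener) {
        listeners.add(listener);
    }

    @Override
    public void send(String peer, byte[] payload) throws Exception {
        for (CommunicationListener l : listeners) {
            l.beforeSend(peer, payload.length);
        }
        long start = System.nanoTime();
        try {
            delegate.send(peer, payload);
        } finally {
            // Runs on failure too, so monitoring also sees sends that threw.
            long elapsed = System.nanoTime() - start;
            for (CommunicationListener l : listeners) {
                l.afterSend(peer, payload.length, elapsed);
            }
        }
    }
}
```

A monitoring plugin (sFlow, Ganglia, whatever) then only has to implement CommunicationListener and register itself; the communication code never needs to know which monitoring system is attached.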