[ https://issues.apache.org/jira/browse/HADOOP-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681668#action_12681668 ]
Steve Loughran commented on HADOOP-5469:
----------------------------------------
HtmlUnit would be the JAR to use to write tests for this.
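
As a rough illustration of the kind of check such a test would make, here is a minimal sketch that does not use HtmlUnit at all. It swaps in the JDK's built-in com.sun.net.httpserver.HttpServer as a stand-in for a Hadoop daemon and HttpURLConnection as the client. The class name MetricsEndpointSketch, its methods, and the fixed response body are hypothetical and are not part of the attached patch; only the "/metrics" path comes from the issue description.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class MetricsEndpointSketch {

    /** Starts a throwaway server on an ephemeral loopback port serving a fixed body at /metrics. */
    static HttpServer startStub(String body) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            server.createContext("/metrics", exchange -> {
                exchange.getResponseHeaders().set("Content-Type", "text/plain; charset=utf-8");
                exchange.sendResponseHeaders(200, bytes.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(bytes);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Fetches the given path from the loopback server and returns the response body as text. */
    static String fetch(int port, String path) {
        try {
            URL url = new URL("http://127.0.0.1:" + port + path);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical body; a real daemon would render its live metrics here.
        HttpServer server = startStub("gcCount=22\nlogError=0\n");
        try {
            System.out.print(fetch(server.getAddress().getPort(), "/metrics"));
        } finally {
            server.stop(0); // release the port and let the JVM exit
        }
    }
}
```

A real test would start the daemon's own HttpServer with the new servlet registered and assert on the live response rather than on a canned string.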
> Exposing Hadoop metrics via HTTP
> --------------------------------
>
> Key: HADOOP-5469
> URL: https://issues.apache.org/jira/browse/HADOOP-5469
> Project: Hadoop Core
> Issue Type: New Feature
> Components: metrics
> Reporter: Philip Zeyliger
> Attachments: HADOOP-5469.patch
>
>
> I'd like to be able to query Hadoop's metrics via HTTP, e.g., by going to
> "/metrics" on any Hadoop daemon that has an HttpServer. My motivation is
> pretty simple--if you're running on a lot of machines, tracking down the
> relevant metrics files is pretty time-consuming; this would be a useful
> debugging utility. I'd also like the output to be parseable, so I could
> write a quick web app to query the metrics dynamically.
> This is similar in spirit to just using JMX, but different. (See also
> HADOOP-4756.) JMX requires a client, and, more annoyingly, JMX requires
> setting up authentication. If you just disable authentication, someone can
> do Bad Things, and if you enable it, you have to worry about yet another
> password. This approach is also more complete--JMX requires separate
> instrumentation, so, for example, the JobTracker's metrics aren't exposed
> via JMX.
> To get the discussion going, I've attached a patch. I had to add a method
> to ContextFactory to get all the active MetricsContexts, implement a do-little
> MetricsContext that simply inherits from AbstractMetricsContext, add a method
> to MetricsContext to get all the records, expose copy methods for the maps in
> OutputRecord, and implement an easy servlet. I ended up removing some
> common code from all MetricsContexts, for setting the period; I'm open to
> taking that out if it muddies the patch significantly.
> I'd love to hear your suggestions. There's a bug in the JSON representation,
> and there's some gross type-handling.
> The patch is missing tests. I wanted to post to gather feedback before I got
> too far, but tests are forthcoming.
> Here's a sample output for a job tracker, while it was running a "pi" job:
> {noformat}
> jvm
> metrics
> {hostName=doorstop.local, processName=JobTracker, sessionId=}
> gcCount=22
> gcTimeMillis=68
> logError=0
> logFatal=0
> logInfo=52
> logWarn=0
> memHeapCommittedM=7.4375
> memHeapUsedM=4.2150116
> memNonHeapCommittedM=23.1875
> memNonHeapUsedM=18.438614
> threadsBlocked=0
> threadsNew=0
> threadsRunnable=7
> threadsTerminated=0
> threadsTimedWaiting=8
> threadsWaiting=15
> mapred
> job
> {counter=Map input records, group=Map-Reduce Framework,
> hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr,
> sessionId=, user=philip}
> value=2.0
> {counter=Map output records, group=Map-Reduce Framework,
> hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr,
> sessionId=, user=philip}
> value=4.0
> {counter=Data-local map tasks, group=Job Counters ,
> hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr,
> sessionId=, user=philip}
> value=4.0
> {counter=Map input bytes, group=Map-Reduce Framework,
> hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr,
> sessionId=, user=philip}
> value=48.0
> {counter=FILE_BYTES_WRITTEN, group=FileSystemCounters,
> hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr,
> sessionId=, user=philip}
> value=148.0
> {counter=Combine output records, group=Map-Reduce Framework,
> hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr,
> sessionId=, user=philip}
> value=0.0
> {counter=Launched map tasks, group=Job Counters ,
> hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr,
> sessionId=, user=philip}
> value=4.0
> {counter=HDFS_BYTES_READ, group=FileSystemCounters,
> hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr,
> sessionId=, user=philip}
> value=236.0
> {counter=Map output bytes, group=Map-Reduce Framework,
> hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr,
> sessionId=, user=philip}
> value=64.0
> {counter=Launched reduce tasks, group=Job Counters ,
> hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr,
> sessionId=, user=philip}
> value=1.0
> {counter=Spilled Records, group=Map-Reduce Framework,
> hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr,
> sessionId=, user=philip}
> value=4.0
> {counter=Combine input records, group=Map-Reduce Framework,
> hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr,
> sessionId=, user=philip}
> value=0.0
> jobtracker
> {hostName=doorstop.local, sessionId=}
> jobs_completed=0
> jobs_submitted=1
> maps_completed=2
> maps_launched=5
> reduces_completed=0
> reduces_launched=1
> rpc
> metrics
> {hostName=doorstop.local, port=50030}
> NumOpenConnections=2
> RpcProcessingTime_avg_time=0
> RpcProcessingTime_num_ops=84
> RpcQueueTime_avg_time=1
> RpcQueueTime_num_ops=84
> callQueueLen=0
> getBuildVersion_avg_time=0
> getBuildVersion_num_ops=1
> getJobProfile_avg_time=0
> getJobProfile_num_ops=17
> getJobStatus_avg_time=0
> getJobStatus_num_ops=32
> getNewJobId_avg_time=0
> getNewJobId_num_ops=1
> getProtocolVersion_avg_time=0
> getProtocolVersion_num_ops=2
> getSystemDir_avg_time=0
> getSystemDir_num_ops=2
> getTaskCompletionEvents_avg_time=0
> getTaskCompletionEvents_num_ops=19
> heartbeat_avg_time=5
> heartbeat_num_ops=9
> submitJob_avg_time=0
> submitJob_num_ops=1
> {noformat}
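
Since the description asks for output that is parseable, here is a minimal sketch of reading the plain-text sample above, assuming the leading "> " quote markers have been stripped. The class MetricsTextSketch and its methods are hypothetical and not part of the patch. It assumes that tag sets are written as "{key=value, ...}", possibly wrapped over several lines, that tag values contain no commas, and that every metric value is numeric, all of which hold for the sample but are not guaranteed by anything in the issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MetricsTextSketch {

    /** Parses a single-line tag set such as "{hostName=doorstop.local, port=50030}". */
    static Map<String, String> parseTags(String line) {
        String s = line.trim();
        if (!s.startsWith("{") || !s.endsWith("}")) {
            throw new IllegalArgumentException("not a tag set: " + line);
        }
        Map<String, String> tags = new LinkedHashMap<>();
        String inner = s.substring(1, s.length() - 1).trim();
        if (inner.isEmpty()) {
            return tags;
        }
        // Assumption: tag values never contain a comma.
        for (String pair : inner.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("missing '=' in tag: " + pair);
            }
            tags.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return tags;
    }

    /**
     * Collects "name=value" metric lines from a block of output into an ordered map.
     * Tag sets (including ones wrapped over several lines) and bare context or
     * record names are skipped. Repeated names overwrite earlier ones, so this
     * flat view only suits a block that holds a single record.
     */
    static Map<String, Double> parseMetrics(String text) {
        Map<String, Double> metrics = new LinkedHashMap<>();
        boolean inTags = false;
        for (String raw : text.split("\n")) {
            String line = raw.trim();
            if (inTags) {
                if (line.endsWith("}")) {
                    inTags = false; // end of a wrapped tag set
                }
                continue;
            }
            if (line.startsWith("{")) {
                inTags = !line.endsWith("}");
                continue;
            }
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // context or record name, e.g. "jvm" or "metrics"
            }
            metrics.put(line.substring(0, eq), Double.parseDouble(line.substring(eq + 1)));
        }
        return metrics;
    }

    public static void main(String[] args) {
        String sample = "jvm\nmetrics\n"
                + "{hostName=doorstop.local, processName=JobTracker, sessionId=}\n"
                + "gcCount=22\nmemHeapUsedM=4.2150116\n";
        System.out.println(parseTags("{hostName=doorstop.local, port=50030}"));
        System.out.println(parseMetrics(sample));
    }
}
```

The sample as quoted does not let a reader tell context names from record names, so the sketch makes no attempt to rebuild that hierarchy. A structured format such as the JSON representation mentioned above would avoid that ambiguity.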