Here's a snippet of TaskTracker metrics produced with Metrics2. (I think the pre-Metrics2 versions had more gaps.) Note that you'll need hadoop-env.sh and hadoop-metrics2.properties set up on every node you want reports from.
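In case you want to post-process this output: each record in the snippet is a millisecond timestamp, a record name ending in a colon, and then comma-separated key=value pairs. Here is a rough Python sketch for splitting one record apart. The name parse_record is my own and not part of Hadoop, and I am assuming one record per line with ", " between pairs, as in the snippet.

```python
import re

# One record: "<epoch millis> <record name>: key=value, key=value, ..."
RECORD_RE = re.compile(r"^(\d+)\s+(\S+):\s*(.*)$")


def parse_record(line):
    """Split one metrics line into (timestamp_ms, record_name, fields).

    Hypothetical helper, not a Hadoop API. Values are kept as strings;
    an empty value such as "sessionId=" becomes "".
    """
    match = RECORD_RE.match(line.strip())
    if match is None:
        raise ValueError("not a metrics record: %r" % line)
    timestamp_ms, name, rest = match.groups()
    fields = {}
    for pair in rest.split(", "):
        if not pair:
            continue  # record with no key=value pairs after the name
        key, _, value = pair.partition("=")
        fields[key.strip()] = value.strip()
    return int(timestamp_ms), name, fields


if __name__ == "__main__":
    sample = ("1345570915435 mapred.tasktracker: context=mapred, sessionId=, "
              "hostName=sqws31.caclab.cac.cpqcorp.net, maps_running=0, "
              "reduces_running=0, mapTaskSlots=2, reduceTaskSlots=2")
    ts, name, fields = parse_record(sample)
    print(ts, name, fields["mapTaskSlots"])
```

Values are left as strings on purpose, since the records mix integers, floats, host names, and empty fields; convert only the keys you care about.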

1345570905436 ugi.ugi: context=ugi, hostName=sqws31.caclab.cac.cpqcorp.net, loginSuccess_num_ops=0, loginSuccess_avg_time=0.0, loginFailure_num_ops=0, loginFailure_avg_time=0.0
1345570905436 jvm.metrics: context=jvm, processName=TaskTracker, sessionId=, hostName=sqws31.caclab.cac.cpqcorp.net, memNonHeapUsedM=11.540627, memNonHeapCommittedM=18.25, memHeapUsedM=12.972412, memHeapCommittedM=61.375, gcCount=1, gcTimeMillis=6, threadsNew=0, threadsRunnable=9, threadsBlocked=0, threadsWaiting=9, threadsTimedWaiting=1, threadsTerminated=0, logFatal=0, logError=0, logWarn=0, logInfo=1
1345570905436 mapred.tasktracker: context=mapred, sessionId=, hostName=sqws31.caclab.cac.cpqcorp.net, maps_running=0, reduces_running=0, mapTaskSlots=2, reduceTaskSlots=2, tasks_completed=0, tasks_failed_timeout=0, tasks_failed_ping=0
1345570905436 rpcdetailed.rpcdetailed: context=rpcdetailed, port=33997, hostName=sqws31.caclab.cac.cpqcorp.net
1345570905436 rpc.rpc: context=rpc, port=33997, hostName=sqws31.caclab.cac.cpqcorp.net, rpcAuthenticationSuccesses=0, rpcAuthenticationFailures=0, rpcAuthorizationSuccesses=0, rpcAuthorizationFailures=0, ReceivedBytes=0, SentBytes=0, RpcQueueTime_num_ops=0, RpcQueueTime_avg_time=0.0, RpcProcessingTime_num_ops=0, RpcProcessingTime_avg_time=0.0, NumOpenConnections=0, callQueueLen=0
1345570905436 metricssystem.MetricsSystem: context=metricssystem, hostName=sqws31.caclab.cac.cpqcorp.net, num_sources=5, num_sinks=1, sink.file.latency_num_ops=0, sink.file.latency_avg_time=0.0, sink.file.dropped=0, sink.file.qsize=0, snapshot_num_ops=5, snapshot_avg_time=0.2, snapshot_stdev_time=0.447213595499958, snapshot_imin_time=0.0, snapshot_imax_time=1.0, snapshot_min_time=0.0, snapshot_max_time=1.0, publish_num_ops=0, publish_avg_time=0.0, publish_stdev_time=0.0, publish_imin_time=3.4028234663852886E38, publish_imax_time=1.401298464324817E-45, publish_min_time=3.4028234663852886E38, publish_max_time=1.401298464324817E-45, dropped_pub_all=0

1345570915435 ugi.ugi: context=ugi, hostName=sqws31.caclab.cac.cpqcorp.net
1345570915435 jvm.metrics: context=jvm, processName=TaskTracker, sessionId=, hostName=sqws31.caclab.cac.cpqcorp.net, memNonHeapUsedM=11.549316, memNonHeapCommittedM=18.25, memHeapUsedM=13.136337, memHeapCommittedM=61.375, gcCount=1, gcTimeMillis=6, threadsNew=0, threadsRunnable=9, threadsBlocked=0, threadsWaiting=9, threadsTimedWaiting=1, threadsTerminated=0, logFatal=0, logError=0, logWarn=0, logInfo=1
1345570915435 mapred.tasktracker: context=mapred, sessionId=, hostName=sqws31.caclab.cac.cpqcorp.net, maps_running=0, reduces_running=0, mapTaskSlots=2, reduceTaskSlots=2
1345570915435 rpcdetailed.rpcdetailed: context=rpcdetailed, port=33997, hostName=sqws31.caclab.cac.cpqcorp.net
1345570915435 rpc.rpc: context=rpc, port=33997, hostName=sqws31.caclab.cac.cpqcorp.net
1345570915435 metricssystem.MetricsSystem: context=metricssystem, hostName=sqws31.caclab.cac.cpqcorp.net, num_sources=5, num_sinks=1, sink.file.latency_num_ops=1, sink.file.latency_avg_time=4.0, snapshot_num_ops=11, snapshot_avg_time=0.16666666666666669, snapshot_stdev_time=0.408248290463863, snapshot_imin_time=0.0, snapshot_imax_time=1.0, snapshot_min_time=0.0, snapshot_max_time=1.0, publish_num_ops=1, publish_avg_time=0.0, publish_stdev_time=0.0, publish_imin_time=0.0, publish_imax_time=1.401298464324817E-45, publish_min_time=0.0, publish_max_time=1.401298464324817E-45, dropped_pub_all=0

1345570925435 ugi.ugi: context=ugi, hostName=sqws31.caclab.cac.cpqcorp.net
1345570925435 jvm.metrics: context=jvm, processName=TaskTracker, sessionId=, hostName=sqws31.caclab.cac.cpqcorp.net, memNonHeapUsedM=13.002403, memNonHeapCommittedM=18.25, memHeapUsedM=11.503555, memHeapCommittedM=61.375, gcCount=2, gcTimeMillis=12, threadsNew=0, threadsRunnable=9, threadsBlocked=0, threadsWaiting=13, threadsTimedWaiting=7, threadsTerminated=0, logFatal=0, logError=0, logWarn=0, logInfo=3
1345570925435 mapred.tasktracker: context=mapred, sessionId=, hostName=sqws31.caclab.cac.cpqcorp.net, maps_running=0, reduces_running=0, mapTaskSlots=2, reduceTaskSlots=2
1345570925435 rpcdetailed.rpcdetailed: context=rpcdetailed, port=33997, hostName=sqws31.caclab.cac.cpqcorp.net
1345570925435 rpc.rpc: context=rpc, port=33997, hostName=sqws31.caclab.cac.cpqcorp.net
1345570925436 mapred.shuffleOutput: context=mapred, sessionId=, hostName=sqws31.caclab.cac.cpqcorp.net, shuffle_handler_busy_percent=0.0, shuffle_output_bytes=0, shuffle_failed_outputs=0, shuffle_success_outputs=0, shuffle_exceptions_caught=0
1345570925436 metricssystem.MetricsSystem: context=metricssystem, hostName=sqws31.caclab.cac.cpqcorp.net, num_sources=6, num_sinks=1, sink.file.latency_num_ops=2, sink.file.latency_avg_time=2.0, snapshot_num_ops=18, snapshot_avg_time=0.14285714285714285, snapshot_stdev_time=0.37796447300922725, snapshot_imin_time=0.0, snapshot_imax_time=1.0, snapshot_min_time=0.0, snapshot_max_time=1.0, publish_num_ops=2, publish_avg_time=0.0, publish_stdev_time=0.0, publish_imin_time=0.0, publish_imax_time=1.401298464324817E-45, publish_min_time=0.0, publish_max_time=1.401298464324817E-45, dropped_pub_all=0

David Wong

-----Original Message-----
From: Mark Olimpiati [mailto:markq2...@gmail.com]
Sent: Wednesday, August 29, 2012 12:54 PM
To: common-user
Subject: Metrics ..

Hi,

I enabled the "metrics.properties" to use FileContext, in which jvm metrics values are written to a file as follows:

jvm.metrics: hostName= localhost, processName=MAP, sessionId=, gcCount=10, gcTimeMillis=130, logError=0, logFatal=0, logInfo=21, logWarn=0, memHeapCommittedM=180.1211, memHeapUsedM=102.630875, memNonHeapCommittedM=23.191406, memNonHeapUsedM=11.828621, threadsBlocked=0, threadsNew=0, threadsRunnable=2, threadsTerminated=0, threadsTimedWaiting=3, threadsWaiting=2

Questions:
- Is this line for a single Map jvm ? as the processName=MAP. If so, why doesn't it show job-id in sessionId ???
- Even though I ran maps and reducers tasks, I only got processName=MAP /SHUFFLE, nothing for reducers why?

Thank you,
Mark