Hello list,
In order to learn about Hadoop performance tuning, I am currently
investigating the effect of certain Hadoop configuration parameters on
certain Hadoop counters. From the command line, I would like to do
something like the following:
for some_config_parameter in set_of_config_values
  Step 1) run a Hadoop job with 'hadoop jar ...'
  Step 2) once the job has finished, get the value of one or more
  Hadoop counters for that job
I know that I can achieve Step 2 with the -counter option of the
mapred job command:
bart@sandy-quad-1:~$ mapred job -counter
Usage: CLI [-counter <job-id> <group-name> <counter-name>]
However, I need to specify a job-id here, and that is where I'm having
trouble... I don't know an easy way to get the job-id of the Hadoop
job that I started in Step 1, nor do I know of a way to specify a
job-id myself in Step 1 so that I can use it later in Step 2.
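To make this concrete, here is the kind of wrapper I have in mind. It is only a sketch: my-job.jar, MyDriver, the input/output paths, and the counter are placeholders, and I am assuming the job id can be grepped out of the "Running job:" line that the MapReduce client prints to its output.

```shell
#!/usr/bin/env bash
# Sketch only: my-job.jar, MyDriver, and the counter below are
# placeholders. I am assuming the client prints a line like
#   INFO mapreduce.Job: Running job: job_1408697508700_0001
set -euo pipefail

extract_job_id() {
  # Pull the first job_<timestamp>_<sequence> token out of the client output.
  grep -oE 'job_[0-9]+_[0-9]+' | head -n 1
}

# Step 1: run the job and capture the client output (it goes to stderr):
#   job_id=$(hadoop jar my-job.jar MyDriver in out 2>&1 | extract_job_id)
# Step 2: query a counter for that job id:
#   mapred job -counter "$job_id" \
#     org.apache.hadoop.mapreduce.TaskCounter MAP_INPUT_RECORDS

# Demonstration of the extraction on a sample client log line:
extract_job_id <<<'INFO mapreduce.Job: Running job: job_1408697508700_0001'
```

The last line prints job_1408697508700_0001 for the sample line, but whether grepping the client output like this is reliable is exactly what I am unsure about.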
I cannot imagine I'm the only one trying to run jobs and then request
some of their counters afterwards. How is this typically solved?
Note that I'm looking for a command-line solution, something that is
scriptable in bash or similar.
Thanks,
Bart