[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921296#action_12921296
 ] 

Ravi Gummadi commented on MAPREDUCE-2137:
-----------------------------------------

In the configuration of gridmix's simulated jobs, the property 
"gridmix.job.name.original" is currently set to the original job's jobID, so 
the property name is misleading. I propose that we have 2 config properties:
(1) "gridmix.job.name.original", which contains the original job's jobName, and
(2) "gridmix.job.id.original", which contains the original job's jobID.
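
As a rough illustration of the two properties, here is a minimal sketch. It 
uses java.util.Properties as a stand-in for the job configuration so that it 
runs on its own; the class name OriginalJobProperties and the method tag() 
are hypothetical and not part of gridmix.

```java
import java.util.Properties;

/** Hypothetical helper, not part of gridmix. */
final class OriginalJobProperties {
    // Property names as proposed in this comment.
    static final String ORIGINAL_JOB_NAME = "gridmix.job.name.original";
    static final String ORIGINAL_JOB_ID = "gridmix.job.id.original";

    private OriginalJobProperties() {}

    /**
     * Records the original job's name and ID in the simulated job's
     * configuration. Properties is an assumption made for this sketch;
     * the real job configuration type is not shown here.
     */
    static Properties tag(Properties conf, String originalJobName, String originalJobId) {
        conf.setProperty(ORIGINAL_JOB_NAME, originalJobName);
        conf.setProperty(ORIGINAL_JOB_ID, originalJobId);
        return conf;
    }
}
```

With this split, a reader of the simulated job's configuration can look up 
the name and the ID separately instead of finding a jobID under a "name" key.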

But these properties can't go into the new trace files generated by Rumen, so 
comparing trace1 and trace2 (from the "Description" of this JIRA) is still an 
issue.

I propose that we change the gridmix simulated jobs' name from 
GRIDMIX<5digitsSequenceNumber>
to
GRIDMIX<6digitsSequenceNumber>_<originalJobID>

This will give us a simple mapping between gridmix's simulated jobs and their 
corresponding original MR jobs.
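
The naming scheme above could be built and parsed roughly as follows. This is 
only a sketch: the class GridmixJobNames and its methods are hypothetical, 
zero-padding of the sequence number is an assumption, and the job ID used in 
the usage note is a made-up example.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of gridmix. */
final class GridmixJobNames {
    // Assumed layout: "GRIDMIX", a 6-digit zero-padded sequence number,
    // an underscore, then the original job's ID.
    private static final Pattern NAME = Pattern.compile("GRIDMIX(\\d{6})_(.+)");

    private GridmixJobNames() {}

    /** Builds the simulated job's name from a sequence number and the original jobID. */
    static String simulatedName(int sequenceNumber, String originalJobId) {
        if (sequenceNumber < 0 || sequenceNumber > 999_999) {
            throw new IllegalArgumentException("sequence number must fit in 6 digits");
        }
        return String.format(Locale.ROOT, "GRIDMIX%06d_%s", sequenceNumber, originalJobId);
    }

    /** Recovers the original jobID, or empty if the name does not follow the scheme. */
    static Optional<String> originalJobId(String simulatedName) {
        Matcher m = NAME.matcher(simulatedName);
        return m.matches() ? Optional.of(m.group(2)) : Optional.empty();
    }
}
```

For example, simulatedName(42, "job_201010211234_0042") gives 
"GRIDMIX000042_job_201010211234_0042", and originalJobId() on that string 
gives back "job_201010211234_0042". A name in the old 5-digit form without 
the suffix yields an empty result.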

Note that the sequence number also changes from 5 digits to 6 digits so that 
one gridmix run can have a larger number of simulated jobs.

Thoughts?

> Mapping between Gridmix jobs and the corresponding original MR jobs is needed
> -----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2137
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2137
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/gridmix
>    Affects Versions: 0.22.0
>            Reporter: Ravi Gummadi
>            Assignee: Ravi Gummadi
>             Fix For: 0.22.0
>
>
> Consider a trace file "trace1" obtained by running Rumen on a set of MR jobs' 
> history logs. When gridmix runs simulated jobs from "trace1", it may skip 
> some of the jobs from the trace file for some reason like out-of-order-jobs. 
> Now use Rumen to generate trace2 from the history logs of gridmix's simulated 
> jobs.
> Now, to compare and analyze the gridmix's simulated jobs with original MR 
> jobs, we need a mapping between them.

