[ https://issues.apache.org/jira/browse/MAPREDUCE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921296#action_12921296 ]
Ravi Gummadi commented on MAPREDUCE-2137:
-----------------------------------------
In the configuration of gridmix's simulated jobs, the property
"gridmix.job.name.original" is set to the original job's jobID, so the
property name is misleading. I propose that we have 2 config
properties:
(1) "gridmix.job.name.original" that contains the original job's jobName and
(2) "gridmix.job.id.original" that contains the original job's jobID
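As a rough sketch of how the two properties above could be set side by side: the helper class OriginalJobInfo and its record() method are hypothetical names used only for illustration, and java.util.Properties stands in for the simulated job's configuration object so that the example is self-contained. Only the two property names come from this proposal.

```java
import java.util.Properties;

/**
 * Hypothetical helper, not existing gridmix code.
 * Properties is an assumed stand-in for the simulated job's configuration.
 */
final class OriginalJobInfo {
    // Property names as proposed in this comment.
    static final String ORIGINAL_JOB_NAME = "gridmix.job.name.original";
    static final String ORIGINAL_JOB_ID = "gridmix.job.id.original";

    private OriginalJobInfo() {}

    /** Stores the original job's name and jobID under separate keys. */
    static void record(Properties conf, String originalJobName, String originalJobId) {
        conf.setProperty(ORIGINAL_JOB_NAME, originalJobName);
        conf.setProperty(ORIGINAL_JOB_ID, originalJobId);
    }
}
```

With this split, each key holds exactly what its name says, so a reader of the simulated job's configuration no longer finds a jobID under a "name" property.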
However, these properties can't go into the new trace files generated by Rumen,
so comparing trace1 and trace2 (from the "Description" of this JIRA) is still an
issue.
I propose that we change the names of gridmix's simulated jobs from
GRIDMIX<5digitsSequenceNumber>
to
GRIDMIX<6digitsSequenceNumber>_<originalJobID>
This will give us a simple mapping between gridmix's simulated jobs and their
corresponding original MR jobs.
Note that the sequenceNumber also changes from 5 digits to 6 digits
so that one gridmix run can have a larger number of simulated jobs.
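To make the proposed naming concrete, here is a minimal sketch of building a name of the form GRIDMIX<6digitsSequenceNumber>_<originalJobID> and recovering the original jobID from it. The class GridmixJobNames and both method names are hypothetical, the jobID in the usage note is a made-up example, and this is not the actual gridmix implementation.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper illustrating the proposed simulated job name format. */
final class GridmixJobNames {
    // "GRIDMIX", exactly 6 digits, an underscore, then the original jobID.
    private static final Pattern NAME = Pattern.compile("GRIDMIX(\\d{6})_(.+)");

    private GridmixJobNames() {}

    /** Builds the simulated job name; the sequence number is zero-padded to 6 digits. */
    static String simulatedJobName(int sequenceNumber, String originalJobId) {
        if (sequenceNumber < 0 || sequenceNumber > 999_999) {
            throw new IllegalArgumentException("sequence number must fit in 6 digits");
        }
        return String.format(Locale.ROOT, "GRIDMIX%06d_%s", sequenceNumber, originalJobId);
    }

    /** Returns the original jobID, or empty if the name is not in the proposed format. */
    static Optional<String> originalJobId(String simulatedJobName) {
        Matcher m = NAME.matcher(simulatedJobName);
        return m.matches() ? Optional.of(m.group(2)) : Optional.empty();
    }
}
```

For example, simulatedJobName(42, "job_201010140941_0001") returns "GRIDMIX000042_job_201010140941_0001", and originalJobId() applied to that name returns "job_201010140941_0001". A name in the old 5-digit format without a suffix yields an empty result.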
Thoughts?
> Mapping between Gridmix jobs and the corresponding original MR jobs is needed
> -----------------------------------------------------------------------------
>
> Key: MAPREDUCE-2137
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2137
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/gridmix
> Affects Versions: 0.22.0
> Reporter: Ravi Gummadi
> Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
>
> Consider a trace file "trace1" obtained by running Rumen on a set of MR jobs'
> history logs. When gridmix runs simulated jobs from "trace1", it may skip
> some of the jobs in the trace file for some reason, such as out-of-order jobs.
> Now use Rumen to generate "trace2" from the history logs of gridmix's simulated
> jobs.
> To compare and analyze gridmix's simulated jobs against the original MR
> jobs, we need a mapping between them.