[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922873#action_12922873
 ] 

Ranjit Mathew commented on MAPREDUCE-2137:
------------------------------------------

Thanks for doing this. Some comments:
* We'll have to update the documentation accordingly, but perhaps only after 
MAPREDUCE-1931 is committed.
* I prefer changing "JOBNAMEPREFIX" to "JOBNAME_PREFIX" or even "JOB_NAME_PFX" 
so that it's more readable. Ditto for "ORIGJOBID" to "ORIG_JOBID" or 
"ORIG_JOB_ID".
* The {{StringBuilder}} in {{initialValue()}} has an _initial capacity_ of 64, 
but the comment makes it seem as if we're talking about the _total capacity_. I 
suggest dropping that comment.
* I know that we're not making the format of the GridMix job name a contract, 
but do you think it makes sense to check in the unit-test that the job name 
has the expected format? (Since Rumen does not generate traces containing the 
value corresponding to "gridmix.job.id.original", the job name is the only 
link back to the original job if you're looking at a Rumen-generated trace.)
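
To make the last two points concrete, here is a minimal Java sketch, not the code from the patch. The class name, the prefix value "GRIDMIX", and the "prefix plus zero-padded sequence number" format are assumptions chosen for illustration. It shows that the argument to {{new StringBuilder(64)}} is only an initial capacity, and what a format check in a unit-test might look like.

```java
import java.util.regex.Pattern;

// Illustrative sketch only; all names and the name format are hypothetical.
class JobNameSketch {
    // Assumed prefix value, not taken from the patch.
    static final String JOB_NAME_PREFIX = "GRIDMIX";

    // Assumed format: the prefix followed by at least five digits.
    static final Pattern EXPECTED_FORMAT =
        Pattern.compile(JOB_NAME_PREFIX + "\\d{5,}");

    // 64 is the initial capacity; the builder grows on demand if a name is longer.
    static final ThreadLocal<StringBuilder> NAME_BUILDER =
        new ThreadLocal<StringBuilder>() {
            @Override
            protected StringBuilder initialValue() {
                return new StringBuilder(64);
            }
        };

    // Builds a job name from a sequence number, reusing the per-thread builder.
    static String jobName(int seq) {
        StringBuilder sb = NAME_BUILDER.get();
        sb.setLength(0); // reset the reused builder before each use
        sb.append(JOB_NAME_PREFIX).append(String.format("%05d", seq));
        return sb.toString();
    }

    // The kind of check a unit-test could make on a generated job name.
    static boolean hasExpectedFormat(String name) {
        return EXPECTED_FORMAT.matcher(name).matches();
    }

    public static void main(String[] args) {
        String name = jobName(7);
        System.out.println(name + " matches expected format: "
            + hasExpectedFormat(name));
    }
}
```

A test along these lines would fail if the name format changed unintentionally, which matters if the job name is the only link back to the original job.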

> Mapping between Gridmix jobs and the corresponding original MR jobs is needed
> -----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2137
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2137
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/gridmix
>    Affects Versions: 0.22.0
>            Reporter: Ravi Gummadi
>            Assignee: Ravi Gummadi
>             Fix For: 0.22.0
>
>         Attachments: 2137.patch
>
>
> Consider a trace file "trace1" obtained by running Rumen on the history logs 
> of a set of MR jobs. When gridmix runs simulated jobs from "trace1", it may 
> skip some of the jobs in the trace file for reasons such as out-of-order 
> jobs. Now use Rumen to generate "trace2" from the history logs of gridmix's 
> simulated jobs.
> To compare and analyze gridmix's simulated jobs against the original MR 
> jobs, we need a mapping between them.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
