[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1295:
-------------------------------------

    Status: Open  (was: Patch Available)

The test failures are known (MAPREDUCE-1311, MAPREDUCE-1312).

Only a few minor nits:
* The two types of DeskewedJobTraceReader constructors could be combined by 
adding a private/protected constructor that takes a JobTraceReader parameter
* Should the System.err messages in {{DeskewedJobTraceReader::nextJob}} be 
debug messages?
* The open/close idiom in {{run}} may be replaced by {{FileSystem::exists}}. 
Alternatively, require the user to provide a clean directory and use sequential 
segment numbering
* The first-person voice in the debug log messages is a little disorienting 
(these could also use log4j loggers), but whatever you prefer
* The {{deletees}} and similar accounting can be replaced with 
{{FileSystem::deleteOnExit}}
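
To illustrate the first nit, here is a minimal sketch of the constructor-delegation idea. It is not taken from the patch: {{TraceReaderSketch}} and {{DeskewedReaderSketch}} are hypothetical stand-ins for JobTraceReader and DeskewedJobTraceReader, and the argument lists (a list of job names, a skew buffer length) are assumptions chosen only to keep the example self-contained.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for JobTraceReader; it only hands out job names.
class TraceReaderSketch {
    private final Deque<String> jobs;

    TraceReaderSketch(List<String> jobs) {
        this.jobs = new ArrayDeque<>(jobs);
    }

    // Returns the next job, or null once the trace is exhausted.
    String getNext() {
        return jobs.poll();
    }
}

// Hypothetical stand-in for DeskewedJobTraceReader.
class DeskewedReaderSketch {
    static final int DEFAULT_SKEW_BUFFER = 0; // assumed default, for illustration

    private final TraceReaderSketch reader;
    private final int skewBufferLength;

    // The visible constructors only adapt their arguments and delegate.
    DeskewedReaderSketch(List<String> jobs) {
        this(new TraceReaderSketch(jobs), DEFAULT_SKEW_BUFFER);
    }

    DeskewedReaderSketch(List<String> jobs, int skewBufferLength) {
        this(new TraceReaderSketch(jobs), skewBufferLength);
    }

    // One private constructor takes the reader and holds all of the setup,
    // so validation and field initialization live in a single place.
    private DeskewedReaderSketch(TraceReaderSketch reader, int skewBufferLength) {
        if (skewBufferLength < 0) {
            throw new IllegalArgumentException("skewBufferLength must be >= 0");
        }
        this.reader = reader;
        this.skewBufferLength = skewBufferLength;
    }

    String nextJob() {
        return reader.getNext();
    }

    int skewBufferLength() {
        return skewBufferLength;
    }
}
```

Whether the shared constructor should be private or protected depends on whether subclasses are expected; the sketch uses private because nothing here subclasses it.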

> We need a job trace manipulator to build gridmix runs.
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-1295
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1295
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Dick King
>            Assignee: Dick King
>         Attachments: mapreduce-1295--2009-12-17.patch, 
> mapreduce-1295--2009-12-21.patch, mapreduce-1295--2009-12-22.patch, 
> mapreduce-1297--2009-12-14.patch
>
>
> Rumen produces "job traces", which are JSON format files describing important 
> aspects of all jobs that are run [successfully or not] on a hadoop map/reduce 
> cluster.  There are two packages under development that will consume these 
> trace files and produce actions in that cluster or another cluster: gridmix3 
> [see jira MAPREDUCE-1124 ] and Mumak [a simulator -- see MAPREDUCE-728 ].
> It would be useful to be able to do two things with job traces, so we can run 
> experiments using these two tools: change the duration, and change the 
> density.  I would like to provide a "folder", a tool that can wrap a 
> long-duration execution trace to redistribute its jobs over a shorter 
> interval, and also change the density by duplicating or culling away jobs 
> from the folded combined job trace.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
