[ https://issues.apache.org/jira/browse/MAPREDUCE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794280#action_12794280 ]
Dick King commented on MAPREDUCE-1295:
--------------------------------------

This is a comment on the previous comment timestamped 23/Dec/09 02:59AM.

* I will make all the {{DeskewedJobTraceReader}} constructors take a {{JobTraceReader}}. I see no reason why {{DeskewedJobTraceReader}} constructors shouldn't be public.
* I will use {{FileSystem::exists}}.
* I didn't notice that API when I wrote this code. My bad.
* I don't want to require an empty directory because we use the output directory as the temp directory if none is supplied, and requiring an empty directory would make it impossible to do two runs into the same output directory.
* I'll change the wording on some of the debug messages. I'll consider going to LOG4J.
* I stand by my decision to use {{deletees}}. I don't like the {{deleteOnExit}} idiom because it makes it hard to use a tool as a callable component.

> We need a job trace manipulator to build gridmix runs.
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-1295
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1295
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Dick King
>            Assignee: Dick King
>         Attachments: mapreduce-1295--2009-12-17.patch,
> mapreduce-1295--2009-12-21.patch, mapreduce-1295--2009-12-22.patch,
> mapreduce-1297--2009-12-14.patch
>
>
> Rumen produces "job traces", which are JSON format files describing important
> aspects of all jobs that are run [successfully or not] on a hadoop map/reduce
> cluster. There are two packages under development that will consume these
> trace files and produce actions in that cluster or another cluster: gridmix3
> [see jira MAPREDUCE-1124] and Mumak [a simulator -- see MAPREDUCE-728].
> It would be useful to be able to do two things with job traces, so we can run
> experiments using these two tools: change the duration, and change the
> density.
> I would like to provide a "folder", a tool that can wrap a
> long-duration execution trace to redistribute its jobs over a shorter
> interval, and also change the density by duplicating or culling away jobs
> from the folded combined job trace.
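
To make the "folding" idea in the issue description concrete, here is a minimal sketch. It is not the code from the attached patches. The class {{TraceFoldSketch}}, the {{Job}} stand-in, and the methods {{fold}} and {{cull}} are all hypothetical names, and it assumes a job can be reduced to an id plus a submit time in milliseconds. Folding is modeled as wrapping each submit time modulo the shorter output interval. Culling is modeled as keeping every n-th job, which is only one of several ways the density could be lowered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; not Rumen's actual folder implementation. */
public class TraceFoldSketch {

    /** Assumed stand-in for a traced job: an id and a submit time in ms. */
    public static final class Job implements Comparable<Job> {
        public final String id;
        public final long submitTime;

        public Job(String id, long submitTime) {
            this.id = id;
            this.submitTime = submitTime;
        }

        @Override
        public int compareTo(Job other) {
            return Long.compare(submitTime, other.submitTime);
        }
    }

    /**
     * Wraps every submit time into [start, start + outputDuration) by taking
     * its offset from start modulo outputDuration, then sorts by new time.
     */
    public static List<Job> fold(List<Job> jobs, long start, long outputDuration) {
        if (outputDuration <= 0) {
            throw new IllegalArgumentException("outputDuration must be positive");
        }
        List<Job> folded = new ArrayList<>();
        for (Job job : jobs) {
            long offset = Math.floorMod(job.submitTime - start, outputDuration);
            folded.add(new Job(job.id, start + offset));
        }
        Collections.sort(folded); // stable: ties keep their input order
        return folded;
    }

    /** Lowers density by keeping every keepEvery-th job (indices 0, keepEvery, ...). */
    public static List<Job> cull(List<Job> jobs, int keepEvery) {
        if (keepEvery <= 0) {
            throw new IllegalArgumentException("keepEvery must be positive");
        }
        List<Job> kept = new ArrayList<>();
        for (int i = 0; i < jobs.size(); i += keepEvery) {
            kept.add(jobs.get(i));
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Job> trace = Arrays.asList(
            new Job("a", 1000L), new Job("b", 1150L),
            new Job("c", 1275L), new Job("d", 1320L));
        // Fold a 320 ms trace into a 100 ms window starting at t = 1000.
        for (Job job : cull(fold(trace, 1000L, 100L), 2)) {
            System.out.println(job.id + " " + job.submitTime);
        }
    }
}
```

Duplicating jobs to raise the density would be the mirror image of {{cull}}. It is left out because the description does not say how duplicated jobs should be spaced.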