[ https://issues.apache.org/jira/browse/MAPREDUCE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026120#comment-13026120 ]
Ravi Gummadi commented on MAPREDUCE-2153:
-----------------------------------------
Discussed offline with Amar. There seems to be no way to mark those methods as
deprecated and remove the references to them without breaking backward
compatibility. Backward compatibility here means supporting older trace files
that contain the individual fields but do not contain JobProperties.
So let us keep the existing setters and getters as they are for some time
(a few releases) and make sure that no more setters and getters are added
from now on.
+1 for the latest patch.
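To illustrate the compatibility constraint, here is a rough sketch of how a reader of the trace could prefer JobProperties and fall back to an old field. It is not taken from the patch: the class TraceJob, the queue field, and resolveQueue() are hypothetical names chosen only for this example.

```java
import java.util.Properties;

/** Hypothetical stand-in for a job record read from a trace file. */
class TraceJob {
    // Old-style top-level field; older trace files carry only this.
    private String queue;
    // Generic property bag; absent (null) in older trace files.
    private Properties jobProperties;

    // Existing setter/getter kept unchanged so older traces still load.
    void setQueue(String queue) { this.queue = queue; }
    String getQueue() { return queue; }

    void setJobProperties(Properties props) { this.jobProperties = props; }
    Properties getJobProperties() { return jobProperties; }

    /** Prefer the property bag; fall back to the old field if it has no value. */
    String resolveQueue(String propertyKey) {
        if (jobProperties != null) {
            String value = jobProperties.getProperty(propertyKey);
            if (value != null) {
                return value;
            }
        }
        return queue;
    }
}
```

With this shape, a trace that has only the old field still resolves to a value, while a newer trace resolves through JobProperties, so no further per-property setters and getters are needed.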
> Bring in more job configuration properties in to the trace file
> ---------------------------------------------------------------
>
> Key: MAPREDUCE-2153
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2153
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tools/rumen
> Affects Versions: 0.23.0
> Reporter: Ravi Gummadi
> Assignee: Rajesh Balamohan
> Attachments: MR-2153-patch.txt, MapReduce-2153-trunk.patch,
> MapReduce-2153-trunk.patch, mr-2153-test-patch-results.txt
>
>
> To emulate distributed cache usage in gridmix jobs, the following 9
> configuration properties need to be available in the trace file:
> (1) mapreduce.job.cache.files
> (2) mapreduce.job.cache.files.visibilities
> (3) mapreduce.job.cache.files.filesizes
> (4) mapreduce.job.cache.files.timestamps
> (5) mapreduce.job.cache.archives
> (6) mapreduce.job.cache.archives.visibilities
> (7) mapreduce.job.cache.archives.filesizes
> (8) mapreduce.job.cache.archives.timestamps
> (9) mapreduce.job.cache.symlink.create
> To emulate data compression in gridmix jobs, the trace file should contain the
> following configuration properties:
> (1) mapreduce.map.output.compress
> (2) mapreduce.map.output.compress.codec
> (3) mapreduce.output.fileoutputformat.compress
> (4) mapreduce.output.fileoutputformat.compress.codec
> (5) mapreduce.output.fileoutputformat.compress.type
> Ideally, gridmix should set many job-specific configuration properties such as
> io.sort.mb, io.sort.factor, etc. when running simulated jobs, to get the same
> effect as the original (real) job in terms of spilled records, number of
> merges, etc.
> TraceBuilder should bring all these properties into the generated trace
> file.
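As a rough illustration of the request above, selecting the 14 enumerated properties from a job configuration could look like the sketch below. It assumes the configuration is available as a plain Map of strings; TracePropertySelector and select() are hypothetical names and are not part of Rumen or TraceBuilder. Only the property keys come from the issue description, and properties such as io.sort.mb are left out because the description does not enumerate them fully.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; not part of Rumen. */
class TracePropertySelector {
    // Keys copied from the issue description (distributed cache, then compression).
    static final Set<String> WANTED_KEYS = new LinkedHashSet<>(Arrays.asList(
        "mapreduce.job.cache.files",
        "mapreduce.job.cache.files.visibilities",
        "mapreduce.job.cache.files.filesizes",
        "mapreduce.job.cache.files.timestamps",
        "mapreduce.job.cache.archives",
        "mapreduce.job.cache.archives.visibilities",
        "mapreduce.job.cache.archives.filesizes",
        "mapreduce.job.cache.archives.timestamps",
        "mapreduce.job.cache.symlink.create",
        "mapreduce.map.output.compress",
        "mapreduce.map.output.compress.codec",
        "mapreduce.output.fileoutputformat.compress",
        "mapreduce.output.fileoutputformat.compress.codec",
        "mapreduce.output.fileoutputformat.compress.type"));

    /** Returns only the wanted entries of jobConf, in WANTED_KEYS order. */
    static Map<String, String> select(Map<String, String> jobConf) {
        Map<String, String> selected = new LinkedHashMap<>();
        for (String key : WANTED_KEYS) {
            String value = jobConf.get(key);
            if (value != null) {
                selected.put(key, value);
            }
        }
        return selected;
    }
}
```

Keeping the keys in one set means that bringing in further properties only requires extending the list, not adding new fields.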
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira