[ http://issues.apache.org/jira/browse/HADOOP-307?page=comments#action_12421030 ]
Doug Cutting commented on HADOOP-307:
-------------------------------------

> The only reason to keep it separate is we don't want these jar files already
> in classpath on all nodes.

Then let's put this in src/contrib/small-job-benchmark, ok? Its build.xml should have "deploy", "test", and "clean" targets. Michel is working on a patch so that all modules in src/contrib will be compiled and tested by the top-level "test" target.

> Many small jobs benchmark for MapReduce
> ---------------------------------------
>
>          Key: HADOOP-307
>          URL: http://issues.apache.org/jira/browse/HADOOP-307
>      Project: Hadoop
>         Type: Task
>   Components: mapred
>     Reporter: Sanjay Dahiya
>     Priority: Minor
>  Attachments: patch.txt
>
> A benchmark that runs many small MapReduce tasks in sequence. A single
> MapReduce implementation is used; it is invoked multiple times, with each run
> taking the output of the previous run as its input. The input to the first map
> is a text file of a few hundred KB, read with TextInputFormat. Input records
> are passed to the output without much processing. The idea is to benchmark the
> time taken to initialize the Mapper and Reducer. An initial prototype on a
> single machine, running 20 MR tasks in sequence, took ~47 seconds per task.
> Suggestions on what else to include in the benchmark are welcome.
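For readers who want to see the shape of the benchmark described in the issue, here is a minimal sketch of the chaining and timing loop. It is not the code in patch.txt and does not use the Hadoop API: the class name SmallJobChain, the methods runIdentityJob and runChain, and the out-N.txt file naming are all hypothetical, and the "job" is a plain file copy standing in for an identity MapReduce job. It only illustrates the pattern of feeding each run's output to the next run and recording per-run wall time.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SmallJobChain {

    // Hypothetical stand-in for one identity MapReduce job:
    // reads every record from the input and writes it unchanged.
    static void runIdentityJob(Path input, Path output) throws IOException {
        List<String> records = Files.readAllLines(input, StandardCharsets.UTF_8);
        Files.write(output, records, StandardCharsets.UTF_8);
    }

    // Runs numJobs jobs in sequence. The output of job i becomes the
    // input of job i + 1. Returns the wall time of each job in milliseconds.
    static List<Long> runChain(Path firstInput, Path workDir, int numJobs)
            throws IOException {
        List<Long> elapsedMillis = new ArrayList<>();
        Path input = firstInput;
        for (int i = 0; i < numJobs; i++) {
            Path output = workDir.resolve("out-" + i + ".txt");
            long start = System.nanoTime();
            runIdentityJob(input, output);
            elapsedMillis.add((System.nanoTime() - start) / 1_000_000);
            input = output; // chain: next job reads what this one wrote
        }
        return elapsedMillis;
    }

    public static void main(String[] args) throws IOException {
        Path workDir = Files.createTempDirectory("small-job-chain");
        Path input = workDir.resolve("input.txt");
        Files.write(input, Arrays.asList("record one", "record two"),
                StandardCharsets.UTF_8);

        // 20 jobs in sequence, matching the prototype run in the issue.
        List<Long> times = runChain(input, workDir, 20);
        for (int i = 0; i < times.size(); i++) {
            System.out.println("job " + i + ": " + times.get(i) + " ms");
        }
    }
}
```

In a real version, runIdentityJob would presumably configure and submit an actual MapReduce job, so that the measured time is dominated by job setup and Mapper/Reducer initialization rather than by file I/O.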
