[jira] Commented: (HADOOP-1965) Handle map output buffers better

Devaraj Das (JIRA) Fri, 23 Nov 2007 03:59:05 -0800

    [ 
https://issues.apache.org/jira/browse/HADOOP-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544999
 ]


Devaraj Das commented on HADOOP-1965:
-------------------------------------

Some indentation needs to be fixed: in quite a few lines of the patch, the only 
change is the indentation of the second line of a code statement. Also, please 
add documentation explaining that there are two buffers, one that sort works on 
and one that collect works on, and describing how the buffers are switched. The 
benchmark assumes that RandomWriter is in the job jar, but since the benchmark 
is part of the test jar, that holds only if the user builds a new jar file 
containing the RandomWriter classes. Maybe you should implement the data 
generation part of the benchmark within the benchmark itself.
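
As a rough illustration of the two-buffer scheme that the documentation should 
describe, here is a minimal sketch. It is not the code in HADOOP-1965-1.patch: 
the class DoubleBufferSketch, its methods, and the record-count capacity are 
all hypothetical, and the "spill" only keeps sorted runs in memory instead of 
writing them to disk.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one buffer receives collected records while a
// background thread sorts and "spills" the other buffer.
public class DoubleBufferSketch {
    private final int capacity;                             // records per buffer (assumed unit)
    private List<String> collectBuffer = new ArrayList<>(); // collect() writes here
    private List<String> sortBuffer = new ArrayList<>();    // sort/spill works here
    private final ExecutorService spiller = Executors.newSingleThreadExecutor();
    private Future<?> pendingSpill = null;
    private final List<List<String>> spills =
        Collections.synchronizedList(new ArrayList<>());

    public DoubleBufferSketch(int capacity) {
        this.capacity = capacity;
    }

    // Called by the map side; blocks only if both buffers are busy.
    public void collect(String record) {
        collectBuffer.add(record);
        if (collectBuffer.size() >= capacity) {
            switchBuffers();
        }
    }

    // Wait until the previous spill (if any) has finished.
    private void awaitSpill() {
        if (pendingSpill == null) {
            return;
        }
        try {
            pendingSpill.get();
        } catch (Exception e) {
            throw new IllegalStateException("spill failed", e);
        }
        pendingSpill = null;
    }

    // Swap the roles of the two buffers and start spilling the full one.
    private void switchBuffers() {
        awaitSpill();                     // the other buffer must be empty again
        final List<String> full = collectBuffer;
        collectBuffer = sortBuffer;       // emptied by the previous spill
        sortBuffer = full;
        pendingSpill = spiller.submit(() -> {
            Collections.sort(full);
            spills.add(new ArrayList<>(full)); // stand-in for writing a spill file
            full.clear();
        });
    }

    // Spill whatever is left and return the sorted runs in spill order.
    public List<List<String>> close() {
        if (!collectBuffer.isEmpty()) {
            switchBuffers();
        }
        awaitSpill();
        spiller.shutdown();
        return spills;
    }
}
```

In this sketch collect() blocks only when the second buffer fills up before the 
previous spill has finished, which is the case where both buffers are in use.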

> Handle map output buffers better
> --------------------------------
>
>                 Key: HADOOP-1965
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1965
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Devaraj Das
>            Assignee: Amar Kamat
>             Fix For: 0.16.0
>
>         Attachments: 1965_single_proc_150mb_gziped.jpeg, 
> 1965_single_proc_150mb_gziped.pdf, 1965_single_proc_150mb_gziped_breakup.png, 
> HADOOP-1965-1.patch, HADOOP-1965-Benchmark.patch
>
>
> Today, the map task stops calling the map method while sort/spill is using 
> the (single instance of the) map output buffer. One way to improve the 
> performance of the map task is to have a second buffer to write the map 
> outputs to while sort/spill is using the first buffer.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
