[
https://issues.apache.org/jira/browse/HADOOP-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570877#action_12570877
]
Mukund Madhugiri commented on HADOOP-1986:
------------------------------------------
Tom,
I ran the Sort benchmarks on 20, 100 and 500 nodes and see that Random Writer
takes about twice as long on 100 nodes with the patch as it does on trunk. I
will do a re-run when the cluster frees up tomorrow to see whether that result
is repeatable.
Here is the data from the runs:
* Sort on 20 nodes:
||Job||trunk (hrs)||trunk + patch (hrs)||
|Random Writer|0.11|0.11|
|Sort|0.31|0.33|
|Sort Validation|0.15|0.15|
* Sort on 100 nodes:
||Job||trunk (hrs)||trunk + patch (hrs)||
|Random Writer|0.13|0.29|
|Sort|0.55|0.44|
|Sort Validation|0.21|0.21|
* Sort on 500 nodes:
||Job||trunk (hrs)||trunk + patch (hrs)||
|Random Writer|0.26|0.27|
|Sort|1.1|1.2|
|Sort Validation|0.21|0.23|
I see a checksum error in the JobTracker logs on the 500-node run, but I see it
on the trunk run as well, so it is not due to your patch.
> Add support for a general serialization mechanism for Map Reduce
> ----------------------------------------------------------------
>
> Key: HADOOP-1986
> URL: https://issues.apache.org/jira/browse/HADOOP-1986
> Project: Hadoop Core
> Issue Type: New Feature
> Components: mapred
> Reporter: Tom White
> Assignee: Tom White
> Fix For: 0.17.0
>
> Attachments: hadoop-serializer-v2.tar.gz, SerializableWritable.java,
> serializer-v1.patch, serializer-v2.patch, serializer-v3.patch,
> serializer-v4.patch, serializer-v5.patch
>
>
> Currently Map Reduce programs have to use WritableComparable-Writable
> key-value pairs. While it's possible to write Writable wrappers for other
> serialization frameworks (such as Thrift), this is not very convenient: it
> would be nicer to be able to use arbitrary types directly, without explicit
> wrapping and unwrapping.
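
As a rough sketch of the wrapping described above (illustration only, not code
from the patch or from Hadoop itself), the boilerplate for one non-Writable type
might look like the following. MyRecord and its toBytes()/fromBytes() methods
are hypothetical stand-ins for, say, a Thrift-generated class and its
serialization calls:
{code:java}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Hypothetical non-Writable record type (e.g. generated by Thrift) that can
// serialize itself to and from a byte array. Names are illustrative only.
class MyRecord {
  byte[] toBytes() { /* ... */ return new byte[0]; }
  static MyRecord fromBytes(byte[] data) { /* ... */ return new MyRecord(); }
}

// The wrapper a user has to write today to pass MyRecord through Map Reduce:
// one class like this per foreign type, plus explicit get()/set() calls in
// every mapper and reducer that touches it.
public class MyRecordWritable implements Writable {
  private MyRecord record;

  public void set(MyRecord record) { this.record = record; }
  public MyRecord get() { return record; }

  public void write(DataOutput out) throws IOException {
    // Length-prefix the serialized bytes so readFields knows how much to read.
    byte[] bytes = record.toBytes();
    out.writeInt(bytes.length);
    out.write(bytes);
  }

  public void readFields(DataInput in) throws IOException {
    byte[] bytes = new byte[in.readInt()];
    in.readFully(bytes);
    record = MyRecord.fromBytes(bytes);
  }
}
{code}
With a general serialization mechanism, map and reduce functions could consume
and produce MyRecord directly, and this wrapper class, along with the explicit
wrapping and unwrapping around it, would no longer be needed.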