[
https://issues.apache.org/jira/browse/HADOOP-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570877#action_12570877
]
Mukund Madhugiri commented on HADOOP-1986:
------------------------------------------
Tom,
I ran the Sort benchmarks on 20, 100 and 500 nodes and see that Random Writer
takes about twice as long on 100 nodes with the patch as it does on trunk. I
will do a re-run when the cluster frees up tomorrow to see whether that result
is repeatable.
Here is the data from the runs:
* Sort on 20 nodes:
||Job||trunk (hrs)||trunk + patch (hrs)||
|Random Writer|0.11|0.11|
|Sort|0.31|0.33|
|Sort Validation|0.15|0.15|
* Sort on 100 nodes:
||Job||trunk (hrs)||trunk + patch (hrs)||
|Random Writer|0.13|0.29|
|Sort|0.55|0.44|
|Sort Validation|0.21|0.21|
* Sort on 500 nodes:
||Job||trunk (hrs)||trunk + patch (hrs)||
|Random Writer|0.26|0.27|
|Sort|1.1|1.2|
|Sort Validation|0.21|0.23|
I see a checksum error in the JobTracker logs on the 500-node run, but I see it
on the trunk run as well, so it is not due to your patch.
> Add support for a general serialization mechanism for Map Reduce
> ----------------------------------------------------------------
>
> Key: HADOOP-1986
> URL: https://issues.apache.org/jira/browse/HADOOP-1986
> Project: Hadoop Core
> Issue Type: New Feature
> Components: mapred
> Reporter: Tom White
> Assignee: Tom White
> Fix For: 0.17.0
>
> Attachments: hadoop-serializer-v2.tar.gz, SerializableWritable.java,
> serializer-v1.patch, serializer-v2.patch, serializer-v3.patch,
> serializer-v4.patch, serializer-v5.patch
>
>
> Currently Map Reduce programs have to use WritableComparable-Writable
> key-value pairs. While it's possible to write Writable wrappers for other
> serialization frameworks (such as Thrift), this is not very convenient: it
> would be nicer to be able to use arbitrary types directly, without explicit
> wrapping and unwrapping.
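
As a rough sketch of the wrapping described above (illustration only, not code
from the patch or from Hadoop itself), the boilerplate for one non-Writable type
might look like the following. MyRecord and its toBytes()/fromBytes() methods
are hypothetical stand-ins for, say, a Thrift-generated class and its
serialization calls:
{code:java}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Hypothetical non-Writable record type (e.g. generated by Thrift) that can
// serialize itself to and from a byte array. Names are illustrative only.
class MyRecord {
  byte[] toBytes() { /* ... */ return new byte[0]; }
  static MyRecord fromBytes(byte[] data) { /* ... */ return new MyRecord(); }
}

// The wrapper a user has to write today to pass MyRecord through Map Reduce:
// one class like this per foreign type, plus explicit get()/set() calls in
// every mapper and reducer that touches it.
public class MyRecordWritable implements Writable {
  private MyRecord record;

  public void set(MyRecord record) { this.record = record; }
  public MyRecord get() { return record; }

  public void write(DataOutput out) throws IOException {
    // Length-prefix the serialized bytes so readFields knows how much to read.
    byte[] bytes = record.toBytes();
    out.writeInt(bytes.length);
    out.write(bytes);
  }

  public void readFields(DataInput in) throws IOException {
    byte[] bytes = new byte[in.readInt()];
    in.readFully(bytes);
    record = MyRecord.fromBytes(bytes);
  }
}
{code}
With a general serialization mechanism, map and reduce functions could consume
and produce MyRecord directly, and this wrapper class, along with the explicit
wrapping and unwrapping around it, would no longer be needed.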