[ https://issues.apache.org/jira/browse/HADOOP-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536148 ]
Tom White commented on HADOOP-1986:
-----------------------------------
Doug/Owen
These changes generally look good - I'll try to work them into a new patch.
In the current patch, Serializers and Deserializers are stateful, with
open/close methods, which is why I separated them. We could combine them in a
single object, but only at the expense of muddying the method names (e.g.
closeSerializer and closeDeserializer), so I'm reluctant to do that. I would
stick with Doug's first SerializationFactory proposal (plus the accept method).
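To make the shape of that design concrete, here is a minimal sketch in Java. It is not the code from the patch: the interface names, the method signatures, and the toy StringSerialization class are all assumptions chosen for illustration. It only shows separate stateful Serializer and Deserializer objects, each with its own open/close, handed out by a factory that also has an accept method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical interfaces: names and signatures are assumptions, not the patch.
interface Serializer<T> {
    void open(OutputStream out) throws IOException;
    void serialize(T t) throws IOException;
    void close() throws IOException;
}

interface Deserializer<T> {
    void open(InputStream in) throws IOException;
    T deserialize() throws IOException;
    void close() throws IOException;
}

// Factory for a serializer/deserializer pair, plus an accept method
// that says whether this factory handles a given class.
interface Serialization<T> {
    boolean accept(Class<?> c);
    Serializer<T> getSerializer();
    Deserializer<T> getDeserializer();
}

// Toy implementation for String, using DataOutputStream/DataInputStream.
final class StringSerialization implements Serialization<String> {
    public boolean accept(Class<?> c) {
        return String.class.equals(c);
    }

    public Serializer<String> getSerializer() {
        return new Serializer<String>() {
            private DataOutputStream out; // state held between open and close
            public void open(OutputStream o) { out = new DataOutputStream(o); }
            public void serialize(String s) throws IOException { out.writeUTF(s); }
            public void close() throws IOException { out.close(); }
        };
    }

    public Deserializer<String> getDeserializer() {
        return new Deserializer<String>() {
            private DataInputStream in; // state held between open and close
            public void open(InputStream i) { in = new DataInputStream(i); }
            public String deserialize() throws IOException { return in.readUTF(); }
            public void close() throws IOException { in.close(); }
        };
    }
}

final class SerializationSketch {
    // Writes one value through the serializer and reads it back.
    static String roundTrip(Serialization<String> s, String value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            Serializer<String> ser = s.getSerializer();
            ser.open(bytes);
            ser.serialize(value);
            ser.close();

            Deserializer<String> de = s.getDeserializer();
            de.open(new ByteArrayInputStream(bytes.toByteArray()));
            String result = de.deserialize();
            de.close();
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because each object owns exactly one stream, plain open and close are unambiguous, which is the naming point made above.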
Another aspect that the current patch doesn't address is who instantiates
objects during deserialization. (Doug - I think you're alluding to this with the
"reuse" object in the Serializer class above?) For Writables and Thrift, the
serialization framework does not instantiate objects; it merely populates the
supplied object from the representation in the stream. For Java Serialization,
the framework reads the type from the stream and instantiates an object of that
type. To cater for this difference, the Deserializer needs to expose whether it
can reuse objects, so that the client (for example ReduceTask) knows whether to
hand it an object or not. This is needed for efficiency (so the client doesn't
needlessly create objects that are never used), and also because some
serialization frameworks don't require classes to have no-arg constructors (in
which case the client could not create the required object anyway).
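One possible way to express this is sketched below in Java. Everything here is hypothetical: the ReusableDeserializer interface, the canReuse method, the two toy deserializers (which read from an int array rather than a real stream), and the ReuseClient helper are assumptions for illustration, not part of the patch. The sketch only shows a client asking the deserializer whether it populates a supplied object, and creating one only in that case.

```java
import java.util.function.Supplier;

// Hypothetical interface: the deserializer says whether it fills in a
// caller-supplied object or creates its own.
interface ReusableDeserializer<T> {
    boolean canReuse();
    // 'reuse' is only meaningful when canReuse() is true; otherwise it is ignored.
    T deserialize(T reuse);
}

// Mutable holder, standing in for a Writable-style type.
final class IntBox {
    int value;
}

// Populates the supplied object, as described above for Writables and Thrift.
final class PopulatingDeserializer implements ReusableDeserializer<IntBox> {
    private final int[] data; // stands in for the input stream
    private int pos;

    PopulatingDeserializer(int... data) { this.data = data; }

    public boolean canReuse() { return true; }

    public IntBox deserialize(IntBox reuse) {
        reuse.value = data[pos++];
        return reuse;
    }
}

// Creates the object itself, as described above for Java Serialization.
final class InstantiatingDeserializer implements ReusableDeserializer<Integer> {
    private final int[] data; // stands in for the input stream
    private int pos;

    InstantiatingDeserializer(int... data) { this.data = data; }

    public boolean canReuse() { return false; }

    public Integer deserialize(Integer ignored) {
        return Integer.valueOf(data[pos++]);
    }
}

final class ReuseClient {
    // Client-side logic: only create (or pass along) an object when the
    // deserializer can actually use it.
    static <T> T readNext(ReusableDeserializer<T> d, Supplier<T> factory, T previous) {
        T target = null;
        if (d.canReuse()) {
            target = (previous != null) ? previous : factory.get();
        }
        return d.deserialize(target);
    }
}
```

With this shape, a client such as ReduceTask would allocate at most one object per stream for a populating deserializer, and would never need to construct an instance (or require a no-arg constructor) for an instantiating one.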
> Add support for a general serialization mechanism for Map Reduce
> ----------------------------------------------------------------
>
> Key: HADOOP-1986
> URL: https://issues.apache.org/jira/browse/HADOOP-1986
> Project: Hadoop
> Issue Type: New Feature
> Components: mapred
> Reporter: Tom White
> Assignee: Tom White
> Fix For: 0.16.0
>
> Attachments: SerializableWritable.java, serializer-v1.patch
>
>
> Currently Map Reduce programs have to use WritableComparable-Writable
> key-value pairs. While it's possible to write Writable wrappers for other
> serialization frameworks (such as Thrift), this is not very convenient: it
> would be nicer to be able to use arbitrary types directly, without explicit
> wrapping and unwrapping.