[
https://issues.apache.org/jira/browse/HADOOP-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537831
]
Vivek Ratan commented on HADOOP-1986:
-------------------------------------
Things get difficult if you want to use a singleton serializer for more than
one class. In your example, suppose that _RecordSerializer_ is the Record I/O
serializer, and can serialize any class that derives from _Record_. If I want
to use Record I/O to serialize all my classes (my key, my value, my
intermediate key, my intermediate value, etc.), then with your scheme, we'd
create one _RecordSerializer_ object per class that we want to serialize, so
one for my intermediate map keys, one for my intermediate map values, and so
on. As we've discussed earlier, serializer objects can contain state (for
example, an input or output stream that they keep open across serializations).
So having multiple _RecordSerializer_ objects can be a problem, especially if
more than one serializes to the same stream. It's quite plausible that we would
want a singleton _RecordSerializer_ object. But if it can only store one class
in its private _recordClass_ variable, then I can't use a singleton object to
serialize multiple classes.
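To make the contrast concrete, here is a rough Java sketch. None of it is taken
from the patch: _Record_ below is a stand-in for the Record I/O base class, and
_KeyRecord_, _ValueRecord_, _BoundRecordSerializer_ and _SharedRecordSerializer_
are hypothetical names used only for illustration. The first serializer is
bound to one class through _recordClass_; the second takes any _Record_ and so
can be shared as a single object that owns the open stream.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the Record I/O base class.
abstract class Record {
    abstract void writeTo(DataOutputStream out) throws IOException;
}

// Hypothetical key type.
class KeyRecord extends Record {
    final int id;
    KeyRecord(int id) { this.id = id; }
    @Override void writeTo(DataOutputStream out) throws IOException { out.writeInt(id); }
}

// Hypothetical value type.
class ValueRecord extends Record {
    final String text;
    ValueRecord(String text) { this.text = text; }
    @Override void writeTo(DataOutputStream out) throws IOException { out.writeUTF(text); }
}

// One serializer object per class: the instance remembers a single class
// and rejects everything else, so it cannot be shared across classes.
class BoundRecordSerializer {
    private final Class<? extends Record> recordClass;
    private final DataOutputStream out;

    BoundRecordSerializer(Class<? extends Record> recordClass, DataOutputStream out) {
        this.recordClass = recordClass;
        this.out = out;
    }

    void serialize(Record r) {
        if (!recordClass.isInstance(r)) {
            throw new IllegalArgumentException("bound to " + recordClass.getName());
        }
        try {
            r.writeTo(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// One shared serializer object: it holds the open stream (its state) and
// accepts any Record subclass, so keys and values go through the same object.
class SharedRecordSerializer {
    private final DataOutputStream out;

    SharedRecordSerializer(DataOutputStream out) { this.out = out; }

    void serialize(Record r) {
        try {
            r.writeTo(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the bound version, serializing both _KeyRecord_ and _ValueRecord_ to one
stream needs two serializer objects holding the same stream; with the shared
version, one object is enough.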
All I'm saying is that if we associate one serializer object with one class, we
lose the ability to share serializer objects across classes, which seems quite
stifling. I'm also arguing that we do want clients to explicitly code for
the two different kinds of serializers, so that memory management is clearer, as
is the performance impact (it's good to know who is responsible for creating
which objects so we can minimize object creation).
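As an illustration of why explicit coding helps, here is a small Java sketch.
It rests on an assumption: this comment does not define the two kinds, so the
sketch takes them to differ in who allocates the deserialized object. The
interfaces _ReusingDeserializer_ and _CreatingDeserializer_, and the _IntBox_
type, are hypothetical names, not part of any patch.

```java
import java.nio.ByteBuffer;

// Hypothetical mutable value type.
final class IntBox {
    int value;
}

// Assumed kind 1: the caller owns the object and passes it in to be refilled,
// so no allocation happens per record.
interface ReusingDeserializer<T> {
    void deserializeInto(T target, ByteBuffer in);
}

// Assumed kind 2: the deserializer allocates and returns a new object
// for every record.
interface CreatingDeserializer<T> {
    T deserialize(ByteBuffer in);
}

final class IntBoxReuser implements ReusingDeserializer<IntBox> {
    @Override
    public void deserializeInto(IntBox target, ByteBuffer in) {
        target.value = in.getInt(); // overwrite the caller's object in place
    }
}

final class IntBoxCreator implements CreatingDeserializer<IntBox> {
    @Override
    public IntBox deserialize(ByteBuffer in) {
        IntBox box = new IntBox(); // one allocation per record
        box.value = in.getInt();
        return box;
    }
}
```

Because the two method signatures differ, the calling code shows at a glance
whether it or the serializer is creating objects.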
> Add support for a general serialization mechanism for Map Reduce
> ----------------------------------------------------------------
>
> Key: HADOOP-1986
> URL: https://issues.apache.org/jira/browse/HADOOP-1986
> Project: Hadoop
> Issue Type: New Feature
> Components: mapred
> Reporter: Tom White
> Assignee: Tom White
> Fix For: 0.16.0
>
> Attachments: SerializableWritable.java, serializer-v1.patch
>
>
> Currently Map Reduce programs have to use WritableComparable-Writable
> key-value pairs. While it's possible to write Writable wrappers for other
> serialization frameworks (such as Thrift), this is not very convenient: it
> would be nicer to be able to use arbitrary types directly, without explicit
> wrapping and unwrapping.