[jira] Commented: (HADOOP-1986) Add support for a general serialization mechanism for Map Reduce

Doug Cutting (JIRA) Tue, 30 Oct 2007 10:38:15 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538882 ]


Doug Cutting commented on HADOOP-1986:
--------------------------------------

> I said that wouldn't work because you will likely want a singleton 
> deserializer object to handle deserializing more than one class [...]

I was with you to that point.  Why must you have a singleton deserializer 
instance that handles more than one class?  If the deserializer does not need 
to know the class (e.g., Java serialization) then a singleton factory can be 
used.  But if the deserializer does need to know the class, either to create an 
instance or for deserialization itself, then a different factory instance would 
need to be created per class.  These could be cached by the framework, so no 
per-deserialized-object allocations need happen.  The client (e.g., 
SequenceFile) can reuse serializers, so they need not be allocated per object 
either.
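
To illustrate the caching idea above, here is a minimal sketch in modern Java. 
It is not the API from the attached patch: the Deserializer interface, the 
DeserializerCache class and the creator function are hypothetical names 
invented for this example.  It only shows that per-class instances can be 
created once and then handed out again without further allocation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical interface for illustration; not taken from serializer-v1.patch.
interface Deserializer<T> {
    T deserialize(byte[] bytes);
}

// Hypothetical framework-side cache: at most one deserializer per class,
// created lazily on first request and reused afterwards.
final class DeserializerCache {
    private final Map<Class<?>, Deserializer<?>> cache = new ConcurrentHashMap<>();
    private final Function<Class<?>, Deserializer<?>> creator;
    private final AtomicInteger created = new AtomicInteger();

    DeserializerCache(Function<Class<?>, Deserializer<?>> creator) {
        this.creator = creator;
    }

    @SuppressWarnings("unchecked")
    <T> Deserializer<T> get(Class<T> c) {
        // computeIfAbsent invokes the creator only when the class is not cached yet.
        return (Deserializer<T>) cache.computeIfAbsent(c, k -> {
            created.incrementAndGet();
            return creator.apply(k);
        });
    }

    // Number of deserializer instances created so far (one per distinct class).
    int createdCount() {
        return created.get();
    }
}
```

A generic client would call get(keyClass) once per file or stream and keep 
the returned instance, so neither the framework nor the client allocates per 
deserialized object.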

> But it adds to my argument that you want to have separate deserialize methods 
> and let the client call the right one.

So would clients like SequenceFile and the mapreduce shuffle require different 
code to deserialize different classes?  We need to have generic client code.

> Again, my point is that deserializers for Thrift and Record I/O cannot create 
> objects themselves and will always require the client to pass in the object 
> [...]

Again, I don't see why Record I/O, where we control the code generation from an 
IDL, cannot generate a no-arg ctor.  Similarly for Thrift.  The ctor does not 
have to be public.  We already bypass protections when we create instances.
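
As a sketch of what bypassing protections can look like, the following uses 
only standard java.lang.reflect calls.  GeneratedRecord stands in for a class 
emitted by an IDL compiler and Instances is a hypothetical helper; neither 
name comes from Record I/O, Thrift or the patch.

```java
import java.lang.reflect.Constructor;

// Stand-in for a generated record class; the name and field are hypothetical.
final class GeneratedRecord {
    int value = 7;

    // No-arg constructor, deliberately not public.
    private GeneratedRecord() { }
}

// Hypothetical helper that instantiates a class through its no-arg constructor,
// whatever that constructor's visibility.
final class Instances {
    private Instances() { }

    static <T> T newInstance(Class<T> c) {
        try {
            // getDeclaredConstructor also finds non-public constructors.
            Constructor<T> ctor = c.getDeclaredConstructor();
            // Suppress the Java access check for this constructor object.
            ctor.setAccessible(true);
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "no usable no-arg constructor on " + c.getName(), e);
        }
    }
}
```

With a helper like this, the generated class keeps its constructor hidden 
from ordinary callers while generic framework code can still create 
instances given only the Class object.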

> Add support for a general serialization mechanism for Map Reduce
> ----------------------------------------------------------------
>
>                 Key: HADOOP-1986
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1986
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Tom White
>            Assignee: Tom White
>             Fix For: 0.16.0
>
>         Attachments: SerializableWritable.java, serializer-v1.patch
>
>
> Currently Map Reduce programs have to use WritableComparable-Writable 
> key-value pairs. While it's possible to write Writable wrappers for other 
> serialization frameworks (such as Thrift), this is not very convenient: it 
> would be nicer to be able to use arbitrary types directly, without explicit 
> wrapping and unwrapping.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
