[ https://issues.apache.org/jira/browse/HADOOP-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532998 ]
Tom White commented on HADOOP-1986:
-----------------------------------

Vivek,

> I'm thinking about serialization not just for key-value pairs for Map/Reduce,
> but also in other places

I agree that it would be useful to have a common serialization mechanism for all parts of Hadoop. The serialization mechanism proposed so far is likely to be applicable more widely since it is so general - it talks in terms of input/output streams and parameterized types. This Jira issue is confined to the MapReduce part, since we have to start somewhere. I think it would be a useful exercise to think through the implications of the design for other parts of Hadoop before committing any changes, though.

> I don't think you want a serializer/deserializer per class.

Not per concrete class, agreed. But per base class (e.g. Writable, Serializable, Thriftable, etc).

> Someone still needs to implement the code for serializing/deserializing that
> class and I don't see any discussion on Hadoop support for Thrift or Record
> which the user can just invoke. Plus, if you think of using this mechanism
> for Hadoop RPC, we will have so many instances of the Serializer<T>
> interface. You're far better off having a HadoopSerializer class that takes
> in any object and automatically serializes/deserializes it. All a user has
> to do is decide which serialization platform to use.

I think you pretty much describe where I would like to get to. If people are using Thrift, for example (and there is a common Thrift interface), then there would be a ThriftSerializer that would just work for people, with little or no configuration. While it should still be relatively easy to write a custom serializer/deserializer, most people will use the standard ones for the standard serialization platforms. There is a question about where these serializers would go - e.g. would ThriftSerializer go in core Hadoop?

> Add support for a general serialization mechanism for Map Reduce
> ----------------------------------------------------------------
>
>                 Key: HADOOP-1986
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1986
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Tom White
>             Fix For: 0.16.0
>
>         Attachments: SerializableWritable.java
>
>
> Currently Map Reduce programs have to use WritableComparable-Writable
> key-value pairs. While it's possible to write Writable wrappers for other
> serialization frameworks (such as Thrift), this is not very convenient: it
> would be nicer to be able to use arbitrary types directly, without explicit
> wrapping and unwrapping.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
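
For illustration, a minimal sketch of the "per base class" idea discussed in the comment above: one Serializer implementation per serialization framework (Writable, Thrift, java.io.Serializable), rather than one per concrete type. The interface shape and names below are assumptions made for the example, not the API from the attached patch.

import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.hadoop.io.Writable;

// Hypothetical sketch only: a stream-based serializer parameterized on a base
// type, so a single implementation covers an entire serialization framework.
interface Serializer<T> {
  void open(OutputStream out) throws IOException;   // bind to an output stream
  void serialize(T t) throws IOException;           // write one object
  void close() throws IOException;                  // release the stream
}

// One implementation per base class: this one handles every Writable.
class WritableSerializer implements Serializer<Writable> {
  private DataOutputStream dataOut;

  public void open(OutputStream out) {
    dataOut = new DataOutputStream(out);
  }

  public void serialize(Writable w) throws IOException {
    w.write(dataOut);   // delegate to the Writable's own wire format
  }

  public void close() throws IOException {
    dataOut.close();
  }
}

Under this sketch, a ThriftSerializer or JavaSerializer would be a parallel implementation of the same interface, which is one way the "just works with little or no configuration" goal could be met.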