[jira] Commented: (HADOOP-6685) Change the generic serialization framework API to use serialization-specific bytes instead of Map<String,String> for configuration

Doug Cutting (JIRA) Fri, 19 Nov 2010 14:32:50 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934030#action_12934030 ]


Doug Cutting commented on HADOOP-6685:
--------------------------------------

> There is petabytes of data in SequenceFile format in Hadoop clusters 
> everywhere. We cannot drop it, we need to maintain it and keep it up to date. 
> We also need to improve to continue to support existing users.

I have never proposed dropping SequenceFile.  I have proposed that we not 
extend it.  I have proposed that if we introduce a new concrete binary object 
data file format (container+serialization) then we should only introduce a 
single such second-generation format.  If we cannot agree on such a format, 
then we will be stuck adding no new formats to the kernel but rather creating 
new formats in external projects.


> Change the generic serialization framework API to use serialization-specific 
> bytes instead of Map<String,String> for configuration
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6685
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6685
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.22.0
>
>         Attachments: libthrift.jar, serial.patch, serial4.patch, 
> serial6.patch, serial7.patch, SerializationAtSummit.pdf
>
>
> Currently, the generic serialization framework uses Map<String,String> for 
> the serialization specific configuration. Since this data is really internal 
> to the specific serialization, I think we should change it to be an opaque 
> binary blob. This will simplify the interface for defining specific 
> serializations for different contexts (MAPREDUCE-1462). It will also move us 
> toward having serialized objects for Mappers, Reducers, etc (MAPREDUCE-1183).

