[ 
https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6984:
---------------------------------------
    Attachment: HDFS-6984.002.patch

I guess making it no longer Writable is probably too big a change.  DistCp 
and other programs rely on being able to write out FileStatus objects and 
read them back later.  However, it is really unpleasant that we can't add 
new fields to the serialized representation of FileStatus.

Here's a new version that resolves this dilemma by switching the 
serialization format for FileStatus objects to Protobuf.  This will let us 
add new fields to FileStatus in the future.  I think this change makes sense 
for Hadoop 3 rather than Hadoop 2, since it is incompatible with the previous 
FileStatus serialization.
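
To illustrate why a tagged format leaves room for new fields, here is a rough 
sketch in plain Java.  It is not the Protobuf wire format and not the code in 
the patch; the class name, the tag numbers, and the "owner" field are all 
hypothetical.  Each field is written as (tag, size, payload), so a reader that 
only knows the older fields can skip the ones it does not recognize.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class TaggedFormatSketch {
    // Hypothetical field numbers, chosen only for this illustration.
    static final int TAG_LENGTH = 1;
    static final int TAG_OWNER = 2;

    // "Newer" writer: emits two fields, each as (tag, payload size, payload).
    static byte[] writeV2(long length, String owner) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(TAG_LENGTH);
            out.writeInt(Long.BYTES);
            out.writeLong(length);
            byte[] ownerBytes = owner.getBytes(StandardCharsets.UTF_8);
            out.writeInt(TAG_OWNER);
            out.writeInt(ownerBytes.length);
            out.write(ownerBytes);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "Older" reader: only understands TAG_LENGTH and skips every other field.
    static long readV1(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            long length = -1;
            while (in.available() > 0) {
                int tag = in.readInt();
                int size = in.readInt();
                if (tag == TAG_LENGTH) {
                    length = in.readLong();
                } else {
                    in.skipBytes(size); // unknown field: step over its payload
                }
            }
            return length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The older reader still recovers the length from the newer encoding.
        System.out.println(readV1(writeV2(1024L, "alice")));
    }
}
```

A fixed-layout format that simply writes fields back to back has no such 
escape hatch: a reader cannot tell where an unfamiliar field begins or ends.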

> In Hadoop 3, make FileStatus no longer a Writable
> -------------------------------------------------
>
>                 Key: HDFS-6984
>                 URL: https://issues.apache.org/jira/browse/HDFS-6984
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch
>
>
> FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this 
> to serialize it and send it over the wire.  But in Hadoop 2 and later, we 
> have the protobuf {{HdfsFileStatusProto}} which serves to serialize this 
> information.  The protobuf form is preferable, since it allows us to add new 
> fields in a backwards-compatible way.  Another issue is that many existing 
> subclasses of FileStatus don't override the Writable methods of the 
> superclass, breaking the interface contract that read(status.write) should be 
> equal to the original status.
> In Hadoop 3, we should just make FileStatus no longer a Writable so that we 
> don't have to deal with these issues.  It's probably too late to do this in 
> Hadoop 2, since user code may be relying on the ability to use the Writable 
> methods on FileStatus objects there.
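
The broken round-trip contract mentioned in the description can be sketched 
in plain Java as follows.  BaseStatus and LocatedStatus are hypothetical 
stand-ins, not Hadoop classes; they only mimic the write/readFields shape of 
a Writable.  The subclass adds a field but inherits the superclass 
serialization methods unchanged, so the field is lost on the way through.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WritableContractSketch {
    // Hypothetical Writable-style base class (not Hadoop's FileStatus).
    static class BaseStatus {
        long length;
        BaseStatus() {}
        BaseStatus(long length) { this.length = length; }
        void write(DataOutput out) throws IOException { out.writeLong(length); }
        void readFields(DataInput in) throws IOException { length = in.readLong(); }
    }

    // Hypothetical subclass: adds a field but does not override write/readFields.
    static class LocatedStatus extends BaseStatus {
        String location;
        LocatedStatus() {}
        LocatedStatus(long length, String location) {
            super(length);
            this.location = location;
        }
    }

    // Serialize the object to bytes and read it back into a fresh instance.
    static LocatedStatus roundTrip(LocatedStatus original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            LocatedStatus copy = new LocatedStatus();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        LocatedStatus copy = roundTrip(new LocatedStatus(42L, "host1"));
        // The inherited methods only carry 'length'; 'location' comes back null.
        System.out.println(copy.length + " " + copy.location); // prints "42 null"
    }
}
```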



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
