[ 
https://issues.apache.org/jira/browse/HDFS-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091258#comment-13091258
 ] 

Todd Lipcon commented on HDFS-2288:
-----------------------------------

One simple scenario is:
- Client is writing a file with replication 1
- Client calls fsync
- the DN crashes, so the pipeline fails
- DN comes back up, and a reader tries to read the file before lease recovery 
has occurred

I think it's correct to return the full number of bytes on the disk. The 
reasoning:
- If just this DN crashed, the pipeline would have recovered before any more 
data was written, and the longer replicas would have a newer generation stamp 
stored in the NN. If the local DN has an older genstamp than what the client 
requests, it will throw an IOException in FSDataset.getReplicaVisibleLength.
- If all of the DNs crashed at the same time, and the client didn't update the 
generation stamp, then the replicas may have different lengths. But, all of the 
replicas are at least as long as the last successful fsync, which is the only 
guarantee we have to provide.
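
A rough sketch of the behavior argued for above, to make the two cases 
concrete. This is not the actual FSDataset code: the class, enum, and field 
names below are hypothetical, and it only models the genstamp check plus 
"return the bytes on disk" for a replica waiting to be recovered (RWR).

```java
import java.io.IOException;

// Hypothetical, simplified model; names do not match the real HDFS classes.
class VisibleLengthSketch {
    enum ReplicaState { FINALIZED, RBW, RWR }

    static final class Replica {
        final ReplicaState state;
        final long bytesOnDisk;
        final long generationStamp;

        Replica(ReplicaState state, long bytesOnDisk, long generationStamp) {
            this.state = state;
            this.bytesOnDisk = bytesOnDisk;
            this.generationStamp = generationStamp;
        }
    }

    // Returns the length a reader may see for this replica.
    // A replica with a genstamp older than the one the client asks for is
    // stale (the pipeline recovered without it), so the call fails instead
    // of exposing a possibly shorter replica.
    static long visibleLength(Replica replica, long requestedGenStamp)
            throws IOException {
        if (replica.generationStamp < requestedGenStamp) {
            throw new IOException("Replica genstamp " + replica.generationStamp
                + " is older than requested genstamp " + requestedGenStamp);
        }
        // Proposed behavior: everything on disk is visible, since it is at
        // least as long as the last successful fsync. (In this simplified
        // model every state returns bytesOnDisk.)
        return replica.bytesOnDisk;
    }

    public static void main(String[] args) throws IOException {
        Replica rwr = new Replica(ReplicaState.RWR, 1024L, 5L);
        System.out.println("visible length = " + visibleLength(rwr, 5L));
    }
}
```

With this model, a reader asking with the replica's current genstamp gets the 
full on-disk length, while a reader asking with a newer genstamp gets an 
IOException rather than a length of 0.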

Thoughts?

> Replicas awaiting recovery should return a full visible length
> --------------------------------------------------------------
>
>                 Key: HDFS-2288
>                 URL: https://issues.apache.org/jira/browse/HDFS-2288
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>
> Currently, if the client calls getReplicaVisibleLength for an RWR (a replica 
> waiting to be recovered), it returns a visible length of 0. This causes one 
> of HBase's tests to fail, and I believe it's incorrect behavior.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira