[ 
https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157533#comment-15157533
 ] 

Mingliang Liu commented on HDFS-7661:
-------------------------------------

Hi [~demongaorui],

Yes, your example makes sense to me. If the first flush on a non-full stripe 
succeeds and a second flush on the same non-full stripe fails, the data may be 
lost under the current design, which states:
{quote}
Data may be corrupt/lost if hflush/hsync fails.
{quote}
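
To make the scenario concrete, here is a minimal client-side sketch, assuming an 
EC file and writes smaller than one full stripe (the path, buffer size, and class 
name are only illustrative, not code from the patch):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PartialStripeFlushExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    byte[] buf = new byte[64 * 1024];        // well below one full stripe of data
    try (FSDataOutputStream out = fs.create(new Path("/ec/file"))) {
      out.write(buf);
      out.hflush();       // first flush on the non-full stripe succeeds
      out.write(buf);     // more data, still within the same stripe
      out.hflush();       // if this second flush fails, the data flushed above
                          // may be lost under the current design
    }
  }
}
{code}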

1. This data loss happens only when exactly {{numDataBlocks}} out of the 
{{numDataBlocks + numParityBlocks}} internal blocks are healthy. If there are 
more healthy internal blocks, the data can still be read/reconstructed without 
problems.
2. I don't know if this is a hard requirement, but we need to define the 
semantics clearly. Before that, consider the current hflush for a non-EC block: 
if a second hflush on the last partial chunk fails, will we lose its checksum? 
The assumption is that if the end of the on-disk data is not chunk-aligned, the 
last checksum needs to be overwritten (see the sketch after this list).
3. Lastly, I think we still have a chance to remedy this on top of the current 
design, though I don't have a detailed plan for it yet. Feel free to update the 
design doc if you have any ideas.
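
For point 2, here is a minimal sketch of that assumption, assuming 512 bytes per 
checksum (the usual default); the offsets are only illustrative and this is not 
actual datanode code:

{code:java}
public class PartialChunkChecksum {
  public static void main(String[] args) {
    final int bytesPerChecksum = 512;   // assumed dfs.bytes-per-checksum
    final long onDiskLen = 1300;        // on-disk data ends mid-chunk

    // The last stored checksum covers the partial chunk [partialChunkStart, onDiskLen).
    long partialChunkStart = (onDiskLen / bytesPerChecksum) * bytesPerChecksum;  // 1024
    long partialChunkLen = onDiskLen - partialChunkStart;                        // 276

    // A later flush that appends into the same chunk must recompute the checksum
    // over the grown partial chunk and overwrite the last stored checksum; losing
    // that overwrite is the concern raised in point 2.
    System.out.println("last checksum covers bytes [" + partialChunkStart + ", "
        + onDiskLen + "), i.e. " + partialChunkLen + " bytes");
  }
}
{code}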

> Erasure coding: support hflush and hsync
> ----------------------------------------
>
>                 Key: HDFS-7661
>                 URL: https://issues.apache.org/jira/browse/HDFS-7661
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: GAO Rui
>         Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, 
> HDFS-7661-unitTest-wip-trunk.patch, HDFS-7661-wip.01.patch, 
> HDFS-EC-file-flush-sync-design-version1.1.pdf, 
> HDFS-EC-file-flush-sync-design-version2.0.pdf
>
>
> We also need to support hflush/hsync and visible length. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
