[
https://issues.apache.org/jira/browse/HDFS-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tsz Wo (Nicholas), SZE updated HDFS-1606:
-----------------------------------------
Description:
In the current design, if there is a datanode/network failure in the write
pipeline, DFSClient will try to remove the failed datanode from the pipeline
and then continue writing with the remaining datanodes. As a result, the
number of datanodes in the pipeline is decreased. Unfortunately, it is
possible that DFSClient may incorrectly remove a healthy datanode but leave the
failed datanode in the pipeline because failure detection may be inaccurate
under erroneous conditions.
We propose a new mechanism for adding datanodes to the pipeline in order to
provide a stronger data guarantee.
was:
In the current design, if there is a datanode/network failures in the write
pipeline, DFSClient will try to remove the failed datanode from the pipeline
and then continue writing with the remaining datanodes. As a result, the
number of datanodes in the pipeline is decreased. Unfortunately, it is
possible that DFSClient may incorrectly remove a healthy datanode but leave the
failed datanode in the pipeline because failure detection may not be accurate
under erroneous conditions.
We propose to add a new mechanism for adding new datanodes to the pipeline in
order to provide a stronger data guarantee.
Assignee: Tsz Wo (Nicholas), SZE
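To make the proposed recovery step concrete, below is a minimal, hypothetical sketch: instead of only dropping the suspected datanode and continuing with a shorter pipeline, the client also asks for a replacement so the pipeline keeps its original width. The types and the {{pickReplacement}} call are illustrative stand-ins only, not the actual DFSClient or namenode APIs.
{code:java}
// Hypothetical sketch only: simple stand-in types, not DFSClient internals.
import java.util.ArrayList;
import java.util.List;

class PipelineRecoverySketch {
  /** Stand-in for asking the namenode for one additional datanode. */
  interface ReplacementSource {
    String pickReplacement(List<String> exclude);
  }

  static List<String> recover(List<String> pipeline, String failedNode,
                              ReplacementSource source) {
    List<String> rebuilt = new ArrayList<>(pipeline);
    rebuilt.remove(failedNode);            // current behavior: drop the failed datanode
    String replacement = source.pickReplacement(rebuilt);
    if (replacement != null) {
      rebuilt.add(replacement);            // proposed: restore the number of replicas
    }
    return rebuilt;
  }

  public static void main(String[] args) {
    List<String> pipeline = List.of("dn1", "dn2", "dn3");
    // Pretend the namenode offers dn4 as the replacement.
    System.out.println(recover(pipeline, "dn2", exclude -> "dn4"));  // [dn1, dn3, dn4]
  }
}
{code}
In the real design, the data already written to the block would presumably also have to be copied to the new datanode before writing resumes; the sketch omits that step.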
Below are two important use cases (a toy policy sketch follows the list):
- h5. Long-Lived Pipeline (e.g. HBase logging)
When a pipeline is short-lived, the failure probability may be negligible.
However, when the client writes very slowly, the pipeline stays open much
longer and the failure probability becomes significant.
- h5. File Append
When a new file is being written, if all the datanodes in a pipeline fail, the
data written so far is lost. Although this behavior is not ideal, it is
acceptable since DFSClient will fail to close the file and we allow data loss
in a never-closed file. However, when a closed file is reopened for append, the
last block _B_ of the file is reopened and a pipeline is re-created (provided
that the pre-append file size is not a multiple of the block size). _B_ will
not be selected for replication until the pipeline is finished. As a result,
the pre-append data stored in _B_ may be lost if all the datanodes in the
pipeline fail and the subsequent block recovery also fails. Such behavior is
unacceptable since the pre-append data was already stored in a closed file.
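To make the two cases above concrete, here is a toy decision sketch for when a replacement datanode should be added. The parameter names and the threshold are purely illustrative assumptions, not actual configuration keys or policy names from this patch.
{code:java}
// Toy sketch of a replacement policy; names and thresholds are illustrative only.
class ReplacementPolicySketch {
  static boolean shouldAddReplacement(boolean blockHasPreAppendData,
                                      boolean pipelineIsLongLived,
                                      int remainingDatanodes) {
    // Append case: the block already holds data from a closed file, so losing
    // the remaining replicas would lose previously acknowledged data.
    if (blockHasPreAppendData) {
      return true;
    }
    // Long-lived pipeline (e.g. a slow writer such as HBase logging): repeated
    // failures are likely over the pipeline's lifetime, so restore its width.
    if (pipelineIsLongLived && remainingDatanodes < 3) {
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(shouldAddReplacement(true, false, 2));   // true: append to a closed file
    System.out.println(shouldAddReplacement(false, true, 2));   // true: long-lived pipeline
    System.out.println(shouldAddReplacement(false, false, 2));  // false: short-lived writer
  }
}
{code}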
> Provide a stronger data guarantee in the write pipeline
> -------------------------------------------------------
>
> Key: HDFS-1606
> URL: https://issues.apache.org/jira/browse/HDFS-1606
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: data-node, hdfs client
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: Tsz Wo (Nicholas), SZE
>
> In the current design, if there is a datanode/network failure in the write
> pipeline, DFSClient will try to remove the failed datanode from the pipeline
> and then continue writing with the remaining datanodes. As a result, the
> number of datanodes in the pipeline is decreased. Unfortunately, it is
> possible that DFSClient may incorrectly remove a healthy datanode but leave
> the failed datanode in the pipeline because failure detection may be
> inaccurate under erroneous conditions.
> We propose a new mechanism for adding datanodes to the pipeline in order to
> provide a stronger data guarantee.