[ 
https://issues.apache.org/jira/browse/HDFS-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10201:
---------------------------------
    Description: 
According to the current design doc for hflush support in erasure coding (see
[HDFS-7661]), the parity datanode (DN) needs an undo log for flush operations.
After hflush/hsync, the last cell will be overwritten when 1) the current
stripe is full, 2) the file is closed, or 3) hflush/hsync is called again for
the current non-full stripe. To serve new reader clients and to tolerate
failures between a successful hflush/hsync and the overwrite operation, the
parity DN should preserve the old cell in the undo log before overwriting it.

Because parity data is computed for a specific block group (BG) length, and
parity data for different BG lengths may have the same block length, the undo
log should also save the corresponding BG length for the flushed data.

This jira is to track the effort of designing and implementing an undo log in 
parity DN to support hflush/hsync operations.
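
The mechanism described above (save the old cell together with its BG length, 
then overwrite) might be sketched as follows. This is only an in-memory 
illustration under stated assumptions: ParityUndoLogSketch, Entry and all 
method names are hypothetical and are not part of HDFS or of the HDFS-7661 
design doc, and a byte array stands in for the on-disk parity block file.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical in-memory undo log for the last parity cell; not an HDFS API. */
final class ParityUndoLogSketch {

  /** One saved version of the last parity cell plus the BG length it encodes. */
  static final class Entry {
    final long bgLength;   // block group length covered by the saved parity
    final int cellOffset;  // offset of the cell inside the parity block
    final byte[] oldCell;  // bytes that were stored before the overwrite

    Entry(long bgLength, int cellOffset, byte[] oldCell) {
      this.bgLength = bgLength;
      this.cellOffset = cellOffset;
      this.oldCell = oldCell;
    }
  }

  private final byte[] block;      // stands in for the parity block file
  private long flushedBgLength;    // BG length of the parity currently stored
  private final Deque<Entry> undoLog = new ArrayDeque<>();

  ParityUndoLogSketch(int blockSize) {
    this.block = new byte[blockSize];
  }

  /** Save the old cell and its BG length first, then overwrite the cell. */
  void overwriteLastCell(int cellOffset, byte[] newCell, long newBgLength) {
    byte[] old =
        Arrays.copyOfRange(block, cellOffset, cellOffset + newCell.length);
    undoLog.push(new Entry(flushedBgLength, cellOffset, old));
    System.arraycopy(newCell, 0, block, cellOffset, newCell.length);
    flushedBgLength = newBgLength;
  }

  /** Roll back the most recent overwrite; returns false if nothing is saved. */
  boolean undo() {
    Entry e = undoLog.poll();
    if (e == null) {
      return false;
    }
    System.arraycopy(e.oldCell, 0, block, e.cellOffset, e.oldCell.length);
    flushedBgLength = e.bgLength;
    return true;
  }

  /** Drop saved cells once the overwrite no longer needs to be undone. */
  void commit() {
    undoLog.clear();
  }

  long flushedBgLength() {
    return flushedBgLength;
  }

  byte[] readCell(int cellOffset, int length) {
    return Arrays.copyOfRange(block, cellOffset, cellOffset + length);
  }
}
```

A real implementation would have to persist the log next to the block file and 
decide when an entry can be discarded; both questions are left open here.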

  was:
According to the current design doc for hflush support in erasure coding (see
[HDFS-7661]), the parity datanode (DN) needs an undo log for flush operations.
After hflush/hsync, the last cell will be overwritten when 1) the current
stripe is full, 2) the file is closed, or 3) hflush/hsync is called again for
the current non-full stripe. To serve new reader clients and to tolerate
failures between a successful hflush/hsync and the overwrite operation, the
parity DN should preserve the old cell in the undo log before overwriting it.

Because parity data is computed for a specific BG length, and parity data for
different BG lengths may have the same block length, the undo log should also
save the corresponding block group (BG) length for the flushed data.

This jira is to track the effort of designing and implementing an undo log in 
parity DN to support hflush/hsync operations.


> Implement undo log in parity datanode for hflush operations
> -----------------------------------------------------------
>
>                 Key: HDFS-10201
>                 URL: https://issues.apache.org/jira/browse/HDFS-10201
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>            Reporter: Mingliang Liu
>            Assignee: Mingliang Liu
>
> According to the current design doc for hflush support in erasure coding (see
> [HDFS-7661]), the parity datanode (DN) needs an undo log for flush
> operations. After hflush/hsync, the last cell will be overwritten when 1) the
> current stripe is full, 2) the file is closed, or 3) hflush/hsync is called
> again for the current non-full stripe. To serve new reader clients and to
> tolerate failures between a successful hflush/hsync and the overwrite
> operation, the parity DN should preserve the old cell in the undo log before
> overwriting it.
> Because parity data is computed for a specific block group (BG) length, and
> parity data for different BG lengths may have the same block length, the undo
> log should also save the corresponding BG length for the flushed data.
> This jira is to track the effort of designing and implementing an undo log in 
> parity DN to support hflush/hsync operations.
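
To illustrate why the BG length has to be stored with the saved cell, here is 
a small arithmetic sketch. It assumes a striped layout in which a parity block 
is as long as the first data block of the group; that layout rule, the helper 
name parityBlockLength, and the example values (6 data blocks, 64 KB cells) 
are assumptions for illustration and are not taken from this issue.

```java
/** Hypothetical helper showing that parity block length hides the BG length. */
final class ParityLengthSketch {

  /**
   * Length of a parity block for a block group holding bgLength bytes of data,
   * assuming the parity block is as long as the first data block.
   */
  static long parityBlockLength(long bgLength, int cellSize, int numDataBlocks) {
    long stripeSize = (long) cellSize * numDataBlocks;
    long lastStripeLen = bgLength % stripeSize;
    if (lastStripeLen == 0) {
      // Every stripe is full, so each block holds an equal share.
      return bgLength / numDataBlocks;
    }
    long fullStripes = bgLength / stripeSize;
    // In the partial stripe, the first data cell (and so the parity cell)
    // is at most one cell long.
    return fullStripes * cellSize + Math.min(lastStripeLen, cellSize);
  }
}
```

Under these assumptions, BG lengths of one, two and six full cells all give a 
parity block of exactly one cell, so the block length alone cannot tell a 
reader which BG length the stored parity corresponds to.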



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
