[ 
https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217030#comment-15217030
 ] 

Zhe Zhang commented on HDFS-7661:
---------------------------------

Thanks for the discussions [~demongaorui], [~liuml07], [~szetszwo].

I read through the design doc and agree that overwriting the parity blocks is 
very complex. So here's an alternative thought:
# At a high level, we don't create temporary parity blocks when {{hflush}} is 
called. Instead, we send the actual data cells to the "parity DNs".
# On the client write path, {{DFSStripedOutputStream#cellBuffers}} keeps all 
data cells until the stripe is full. So when {{hflush}} is called, the client 
can transfer all {{cellBuffers}} to all parity DNs. Yes, this causes some 
additional data transfer, but the cell size is only 64KB.
# On the "parity DN", we can create special files (details to be discussed), 
one for each temporary data cell. Future {{hflush}} operations will append to 
these special files. Parity blocks will be written *without any 
overwriting*.
# Client read logic needs to be extended to read the special "data cell" files 
when needed. I think "when needed" means the case where the parity block is 
shorter than expected (with the expected length calculated from the length of 
the logical block group). Alternatively, the "parity DN" can locally apply the 
"data cell" files through encoding and transfer the longer version of the 
parity block to the client reader.
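
To make the four steps above more concrete, here is a rough in-memory sketch. 
It is only a toy model under several assumptions, not HDFS code: 
{{HflushSketch}}, {{ParityNode}}, {{Client}}, {{appendCell}} and 
{{encodeLocally}} are hypothetical names, the "special files" are modelled as 
byte buffers, the client sends only the bytes added since the previous 
{{hflush}} (one possible reading of "appended to"), and XOR stands in for the 
real erasure codec.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy in-memory model of the proposal; none of these classes exist in HDFS. */
public class HflushSketch {

    /** Hypothetical stand-in for a "parity DN". */
    static class ParityNode {
        // One append-only "special file" per data cell of the current stripe.
        private final List<ByteArrayOutputStream> cellFiles = new ArrayList<>();

        ParityNode(int numDataCells) {
            for (int i = 0; i < numDataCells; i++) {
                cellFiles.add(new ByteArrayOutputStream());
            }
        }

        /** Appends newly flushed bytes of one data cell; nothing is overwritten. */
        void appendCell(int cellIndex, byte[] delta) {
            cellFiles.get(cellIndex).write(delta, 0, delta.length);
        }

        int cellLength(int cellIndex) {
            return cellFiles.get(cellIndex).size();
        }

        /** Encodes the stored cells locally. XOR is an assumed placeholder codec. */
        byte[] encodeLocally() {
            int len = 0;
            for (ByteArrayOutputStream f : cellFiles) {
                len = Math.max(len, f.size());
            }
            byte[] parity = new byte[len]; // shorter cells count as zero-padded
            for (ByteArrayOutputStream f : cellFiles) {
                byte[] cell = f.toByteArray();
                for (int i = 0; i < cell.length; i++) {
                    parity[i] ^= cell[i];
                }
            }
            return parity;
        }
    }

    /** Hypothetical stand-in for the client side of a striped output stream. */
    static class Client {
        private final ByteArrayOutputStream[] cellBuffers;
        private final int[] flushedLength; // bytes of each cell already sent
        private final List<ParityNode> parityNodes;

        Client(int numDataCells, List<ParityNode> parityNodes) {
            this.cellBuffers = new ByteArrayOutputStream[numDataCells];
            for (int i = 0; i < numDataCells; i++) {
                cellBuffers[i] = new ByteArrayOutputStream();
            }
            this.flushedLength = new int[numDataCells];
            this.parityNodes = parityNodes;
        }

        /** Buffers data for one cell of the (not yet full) stripe. */
        void write(int cellIndex, byte[] data) {
            cellBuffers[cellIndex].write(data, 0, data.length);
        }

        /** Sends the unflushed tail of every data cell to every parity node. */
        void hflush() {
            for (int i = 0; i < cellBuffers.length; i++) {
                byte[] all = cellBuffers[i].toByteArray();
                byte[] delta = Arrays.copyOfRange(all, flushedLength[i], all.length);
                if (delta.length == 0) {
                    continue; // nothing new for this cell
                }
                for (ParityNode dn : parityNodes) {
                    dn.appendCell(i, delta);
                }
                flushedLength[i] = all.length;
            }
        }
    }

    public static void main(String[] args) {
        ParityNode p0 = new ParityNode(2);
        ParityNode p1 = new ParityNode(2);
        Client client = new Client(2, Arrays.asList(p0, p1));
        client.write(0, new byte[] {1, 2});
        client.write(1, new byte[] {4});
        client.hflush();
        client.write(1, new byte[] {8});
        client.hflush(); // only the new byte of cell 1 is appended
        System.out.println(Arrays.toString(p0.encodeLocally())); // prints [5, 10]
    }
}
```

In this model each {{hflush}} only ever appends to the per-cell buffers, which 
is the property the proposal is after. Whether the reader fetches the cell 
files directly or asks the parity node to run something like 
{{encodeLocally}} corresponds to the two options in the last step.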

> [umbrella] support hflush and hsync for erasure coded files
> -----------------------------------------------------------
>
>                 Key: HDFS-7661
>                 URL: https://issues.apache.org/jira/browse/HDFS-7661
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: erasure-coding
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: GAO Rui
>         Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, 
> HDFS-7661-unitTest-wip-trunk.patch, HDFS-7661-wip.01.patch, 
> HDFS-EC-file-flush-sync-design-v20160323.pdf, 
> HDFS-EC-file-flush-sync-design-version1.1.pdf
>
>
> We also need to support hflush/hsync and visible length. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
