[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217030#comment-15217030 ]
Zhe Zhang commented on HDFS-7661:
---------------------------------

Thanks for the discussions [~demongaorui], [~liuml07], [~szetszwo]. I read through the design doc and agree that overwriting the parity blocks is very complex. So here's an alternative thought:
# At a high level, we don't create temporary parity blocks when {{hflush}} is called. Instead, we can send the actual data cells to the "parity DNs".
# On the client write path, {{DFSStripedOutputStream#cellBuffers}} keeps all data cells until the stripe is full. So when {{hflush}} is called, the client can transfer all {{cellBuffers}} to all parity DNs. Yes, this will cause some additional data transfer, but the cell size is only 64KB.
# On the "parity DN", we can create special files (details to be discussed), one for each temporary data cell. These special files will be appended to on future {{hflush}} operations. Parity blocks will be handled *without any overwriting*.
# Client read logic needs to be extended to read the special "data cell" files when needed. I think that means when the length of the parity block is shorter than expected (calculated from the length of the logical block group). Alternatively, the "parity DN" can locally apply the "data cell" files through encoding, and transfer the longer version of the parity block to the client reader.

> [umbrella] support hflush and hsync for erasure coded files
> -----------------------------------------------------------
>
>                 Key: HDFS-7661
>                 URL: https://issues.apache.org/jira/browse/HDFS-7661
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: erasure-coding
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: GAO Rui
>         Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, HDFS-7661-unitTest-wip-trunk.patch, HDFS-7661-wip.01.patch, HDFS-EC-file-flush-sync-design-v20160323.pdf, HDFS-EC-file-flush-sync-design-version1.1.pdf
>
>
> We also need to support hflush/hsync and visible length.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)