[ 
https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177055#comment-15177055
 ] 

Mingliang Liu commented on HDFS-7661:
-------------------------------------

Thanks for the comments, [~sinall]!

1. Let me check that I understand your example correctly. In your example, I 
think the reader is expected to read the data at the V1 length, since that was 
the visible length when it initially issued the read request. As explained in 
the design, the reader collects the visible BG length from all parity DNs and 
calculates the maximum BG length to read. The reader client then issues read 
requests to the parity DNs, providing the expected BG length. First, by design 
the reader client is able to detect conflicting BG lengths. Second, the reader 
client retries on conflicting BG lengths and will eventually get the same 
version from all the parity DNs, say V9. Last, given the expected BG length, if 
a parity DN finds that the requested data has already been overwritten, it 
throws an exception to the reader client, which then also retries to catch up 
with the latest flushed data. In the worst case, the reader client retries 
until the stripe is full, at which point all the parity DNs have the same data. 
We may discuss the algorithms later, but I believe the current design can 
handle this case.
2. I think a client-side lock mechanism is even more complex than a lock in 
the NN. The reader/writer client may fail or close at any time, so coordinating 
a reader/writer lock through the NN in this way is extremely hard, if possible 
at all. Most importantly, the writer should not be blocked by readers for the 
sake of data consistency. IMHO this needs a fundamental change.
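
To make the retry flow in point 1 concrete, here is a minimal Java sketch of one 
possible reading of it. It is not the patch and not an HDFS API: the 
{{ParityNode}} interface, {{ScriptedNode}}, {{resolveReadLength}} and the 
{{maxRetries}} bound are all hypothetical names introduced for illustration. 
The sketch assumes the reader retries whenever the parity DNs report different 
BG lengths, or whenever a DN says the requested length was already overwritten.

```java
import java.util.List;
import java.util.OptionalLong;

public class BlockGroupLengthSketch {

    /** Hypothetical stand-in for a parity DN; not a real HDFS interface. */
    public interface ParityNode {
        /** Visible (last flushed) BG length this node currently reports. */
        long visibleLength();

        /** False if data at the expected length was already overwritten. */
        boolean canServe(long expectedLength);
    }

    /** Test double that reports a scripted sequence of lengths, one per poll. */
    public static final class ScriptedNode implements ParityNode {
        private final long[] lengths;
        private int polls = 0;
        private long lastReported = -1;

        public ScriptedNode(long... lengths) {
            this.lengths = lengths;
        }

        @Override
        public long visibleLength() {
            // Once the script is exhausted, keep repeating the final value.
            lastReported = lengths[Math.min(polls, lengths.length - 1)];
            polls++;
            return lastReported;
        }

        @Override
        public boolean canServe(long expectedLength) {
            // Assumption: only the most recently reported version is readable.
            return expectedLength == lastReported;
        }
    }

    /**
     * Polls every parity node, takes the maximum reported length as the
     * expected BG length, and retries if the nodes disagree or if any node
     * can no longer serve that length. Returns empty once retries run out.
     */
    public static OptionalLong resolveReadLength(
            List<? extends ParityNode> nodes, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            long expected = -1;
            boolean conflict = false;
            for (ParityNode node : nodes) {
                long len = node.visibleLength();
                if (expected != -1 && len != expected) {
                    conflict = true; // nodes are at different versions
                }
                expected = Math.max(expected, len);
            }
            if (conflict) {
                continue; // retry until all nodes report the same length
            }
            boolean servable = true;
            for (ParityNode node : nodes) {
                if (!node.canServe(expected)) {
                    servable = false; // overwritten in the meantime
                }
            }
            if (servable) {
                return OptionalLong.of(expected);
            }
        }
        return OptionalLong.empty();
    }
}
```

With two scripted nodes where one first reports an older length and then 
catches up, the first round is a conflict and the second round agrees. A real 
reader would presumably stop retrying once the stripe is full rather than use a 
fixed retry count; the bound here only keeps the sketch terminating.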

The semantics are to make successfully flushed data visible to new readers; if 
an {{hflush}} fails, the reader should be able to reconstruct the block group 
up to the last successfully flushed length. For more details, please refer to 
the revised design doc.


> Erasure coding: support hflush and hsync
> ----------------------------------------
>
>                 Key: HDFS-7661
>                 URL: https://issues.apache.org/jira/browse/HDFS-7661
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: GAO Rui
>         Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, 
> HDFS-7661-unitTest-wip-trunk.patch, HDFS-7661-wip.01.patch, 
> HDFS-EC-file-flush-sync-design-version1.1.pdf, 
> HDFS-EC-file-flush-sync-design-version2.0.pdf, 
> HDFS-EC-file-flush-sync-design-version2.1.pdf
>
>
> We also need to support hflush/hsync and visible length. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
