[ 
https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035560#comment-15035560
 ] 

GAO Rui commented on HDFS-7661:
-------------------------------

Sorry, I misunderstood this earlier.  I now see that if we have a protocol 
to overwrite the last cell of each parity block, the problem becomes how to 
keep the data consistent.  

How about using a lock, on the datanode side, for the parity block files of a 
file that is being written? 

On the writing side, for flush/sync/write (all the actions that may cause 
parity cells to be regenerated), we could append data cells without holding 
the lock. But before sending the overwrite protocol messages, we try to 
acquire the lock. If the parity block files are being read, the 
flush/sync/write blocks until the read operation finishes. After acquiring the 
lock, we overwrite the last few bytes of the parity block files, then release 
the lock.

On the reading side, we also need to acquire the lock before reading and 
release it after reading. 
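
The write-side and read-side rules above can be sketched roughly as follows. 
This is only an illustration, not HDFS code: ParityBlockFile, overwriteTail 
and read are hypothetical names, the parity block file is modelled as an 
in-memory byte array, and a ReentrantReadWriteLock is used as an assumption 
(the proposal only says "a lock") so that several readers do not block each 
other.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical in-memory stand-in for one parity block file on a datanode. */
class ParityBlockFile {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private byte[] data = new byte[0];

    /**
     * Writer side: replace everything from offset onwards with newTail,
     * i.e. overwrite the last parity cell. Blocks while any reader holds
     * the read lock.
     */
    void overwriteTail(int offset, byte[] newTail) {
        lock.writeLock().lock();
        try {
            // Keep bytes [0, offset), then place the regenerated cell.
            byte[] updated = Arrays.copyOf(data, offset + newTail.length);
            System.arraycopy(newTail, 0, updated, offset, newTail.length);
            data = updated;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Reader side: take the lock, copy the contents, release the lock. */
    byte[] read() {
        lock.readLock().lock();
        try {
            return Arrays.copyOf(data, data.length);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With this shape a reader sees the parity bytes either entirely before or 
entirely after an overwrite, never a partially overwritten cell.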

I think this could guarantee data consistency. So the problem becomes how to 
implement a lock that can lock and unlock all the parity block files at the 
same time. I think we could keep a queue for every parity block file lock, 
and set the rule that a client can only get the lock once it has been moved 
to the head of the lock queues of all the parity block files. 

Another concern is that if we lock the whole parity block file during 
flush/sync/write, a reader cannot read the earlier parity cells in that 
parity block file. So this trades performance for data consistency.

> Support read when a EC file is being written
> --------------------------------------------
>
>                 Key: HDFS-7661
>                 URL: https://issues.apache.org/jira/browse/HDFS-7661
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: GAO Rui
>         Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, 
> HDFS-7661-unitTest-wip-trunk.patch
>
>
> We also need to support hflush/hsync and visible length. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
