[
https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035560#comment-15035560
]
GAO Rui commented on HDFS-7661:
-------------------------------
Sorry, I misunderstood earlier. I now see that if we have a protocol to
overwrite the last cell of each parity block, the problem becomes how to keep
the data consistent.
How about using a lock on the datanode side for the parity block files of a
file that is being written?
On the writing side, for flush/sync/write (all of which may cause parity cells
to be regenerated), we could append data cells without holding the lock. But
before sending the overwrite protocol messages, we try to acquire the lock. If
the parity block files are being read, the flush/sync/write blocks until the
read operation finishes. After acquiring the lock, we overwrite the last few
bytes of the parity block files, then release the lock.
On the reading side, we also need to acquire the lock before reading and
release it after reading.
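
A minimal sketch of the locking rule described above, for a single parity block
file. Everything here is hypothetical and is not HDFS code: the class
ParityBlockSketch, its in-memory byte array standing in for the block file, and
the choice of a read/write lock (so several readers can proceed together) are
all assumptions. It also assumes a single writer per file.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical in-memory stand-in for one parity block file on a datanode. */
final class ParityBlockSketch {
    private final byte[] data;
    private volatile int length = 0; // number of bytes visible to readers
    // Fair mode, so a waiting overwrite is not starved by a stream of readers.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

    ParityBlockSketch(int capacity) {
        this.data = new byte[capacity];
    }

    /** Append a cell without the lock: bytes already visible are not touched. */
    void appendCell(byte[] cell) {
        int newLength = length + cell.length;
        if (newLength > data.length) {
            throw new IllegalArgumentException("block is full");
        }
        System.arraycopy(cell, 0, data, length, cell.length);
        length = newLength; // publish only after the copy completes
    }

    /** Overwrite the last bytes in place; waits for in-flight reads to finish. */
    void overwriteLastCell(byte[] cell) {
        lock.writeLock().lock();
        try {
            if (cell.length > length) {
                throw new IllegalArgumentException("cell longer than block");
            }
            System.arraycopy(cell, 0, data, length - cell.length, cell.length);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Read everything written so far while holding the read lock. */
    byte[] read() {
        lock.readLock().lock();
        try {
            return Arrays.copyOf(data, length);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With this shape, appendCell never waits on readers, while overwriteLastCell
waits for any read in progress and holds off new reads until it returns.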
I think this could guarantee data consistency. So the problem becomes how to
implement a lock that can lock and unlock all the parity block files at the
same time. I think we could keep a queue for every parity block file lock, with
the rule that a client can only get the lock once it has been moved to the head
of all the parity block file lock queues.
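
The head-of-all-queues rule can be sketched as follows. This is only an
illustration under strong assumptions: the class ParityLockQueues and its
method names are hypothetical, all queues live in one process, and a client is
enqueued on every queue in one atomic step so the queues always agree on order.
Real datanodes would need some way to reach the same ordering, which this
sketch does not address.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical lock made of one wait queue per parity block file. */
final class ParityLockQueues {
    private final List<Deque<String>> queues = new ArrayList<>();

    ParityLockQueues(int parityBlockCount) {
        for (int i = 0; i < parityBlockCount; i++) {
            queues.add(new ArrayDeque<>());
        }
    }

    /** Register a client at the tail of every queue in one atomic step. */
    synchronized void enqueue(String clientId) {
        for (Deque<String> q : queues) {
            q.addLast(clientId);
        }
    }

    /** A client holds the lock only if it is at the head of every queue. */
    synchronized boolean holdsLock(String clientId) {
        for (Deque<String> q : queues) {
            if (!clientId.equals(q.peekFirst())) {
                return false;
            }
        }
        return true;
    }

    /** Unlock all parity block files at once by leaving every queue. */
    synchronized void release(String clientId) {
        if (!holdsLock(clientId)) {
            throw new IllegalStateException(clientId + " does not hold the lock");
        }
        for (Deque<String> q : queues) {
            q.removeFirst();
        }
    }
}
```

For example, if "writer" and then "reader" are enqueued, holdsLock("writer") is
true and holdsLock("reader") is false until release("writer") is called.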
Another concern is that if we lock the whole parity block file during
flush/sync/write, a reader cannot read the earlier parity cells in that parity
block file. This trades read performance for data consistency.
> Support read when a EC file is being written
> --------------------------------------------
>
> Key: HDFS-7661
> URL: https://issues.apache.org/jira/browse/HDFS-7661
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Tsz Wo Nicholas Sze
> Assignee: GAO Rui
> Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png,
> HDFS-7661-unitTest-wip-trunk.patch
>
>
> We also need to support hflush/hsync and visible length.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)