[ 
https://issues.apache.org/jira/browse/HDFS-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629915#comment-13629915
 ] 

Nicolas Liochon commented on HDFS-4689:
---------------------------------------

Are you sure it's not HBASE-7878 / HBASE-8204 (i.e. HBase now waits for the
lease to be recovered)?
If you can reproduce the problem "on demand", you may want to backport the fix 
(it should be a simple copy paste of the file) to try.
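
For reference, "waiting for the lease to be recovered" amounts to polling until the old writer's file is reported closed before the log is read. Below is a minimal sketch of that loop, in Python for brevity. `recover_lease` is a hypothetical stand-in for the real call (in HDFS, `DistributedFileSystem.recoverLease(Path)` returns true once the file is closed), and the attempt count and pause are illustrative assumptions, not the values used by the HBase fixes.

```python
import time
from typing import Callable


def wait_for_lease_recovery(
    recover_lease: Callable[[], bool],
    attempts: int = 10,
    pause_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll recover_lease() until it reports the file as closed.

    recover_lease is a hypothetical callable standing in for the real
    lease-recovery call. Returns True once the file is closed, or False
    if it is still open after `attempts` tries.
    """
    for attempt in range(attempts):
        if recover_lease():
            return True
        # Do not sleep after the final attempt; just give up.
        if attempt < attempts - 1:
            sleep(pause_s)
    return False
```

A caller would only start replaying the log once this returns True, and would treat False as "the old writer may still hold the file".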

If you have a new data loss scenario in HBase, you should create an HBase jira,
as it's pretty critical and everyone should know (or link this one to it if
the jira is already created). Thanks!
                
> freeze/seal a hdfs file
> -----------------------
>
>                 Key: HDFS-4689
>                 URL: https://issues.apache.org/jira/browse/HDFS-4689
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, hdfs-client, namenode
>    Affects Versions: 2.0.0-alpha
>            Reporter: Zesheng Wu
>              Labels: freeze, seal
>
> I would like to describe the problem scenario first. In our HBase
> cluster:
> 1. rs1 loses its zookeeper lock, and hmaster realizes that
> 2. hmaster assigns the regions of rs1 to rs2
> 3. rs2 renames the hlog of rs1, and begins to replay the log
> 4. meanwhile, rs1 is still running, and the client still writes
> data to rs1
> 5. in this scenario, the data written after rs2 renamed rs1's hlog is
> lost
> The root cause of the problem is as follows.
> When an HDFS file is open for write, the file metadata is only updated when
> a block is finished or when the file is closed, but the client considers
> the data successfully written as soon as it receives an ack from the
> datanode. So after a file is renamed, the client is not required to refresh
> the metadata immediately: it does not notice the rename and keeps writing
> to the current block, successfully, until the block is finished or the file
> is closed. The data written during this window will be lost.
> The basic idea for solving this is to add freeze/seal semantics for a
> file: once a file is frozen/sealed, the client can't write any data to it,
> but it can still be renamed or deleted.
> If we could freeze/seal a file, the scenario above would look like this:
> 1. rs1 loses its zookeeper lock, and hmaster realizes that
> 2. hmaster freezes/seals the hlog of rs1
> 3. hmaster assigns the regions of rs1 to rs2
> 4. rs2 renames the hlog of rs1, and begins to replay the log
> 5. after rs2 has successfully replayed the log, the log file is deleted
> 6. in this scenario, once hmaster has frozen/sealed the hlog file of rs1,
> rs1 can't write any data to it even if it is still running, which
> guarantees that no data is lost
> I hope I've described the problem clearly. Has anyone already worked
> on this feature? Any ideas about this would be much appreciated.
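
The two scenarios in the description can be compared with a toy model. This is only a sketch of the behaviour as the report describes it, not of real HDFS internals: `ToyLogFile`, `SealedError` and `run_scenario` are hypothetical names, and "what the replayer sees" is simplified to "everything written before the rename".

```python
class SealedError(Exception):
    """Raised when a writer tries to append to a frozen/sealed file."""


class ToyLogFile:
    """Toy model of an hlog, following the scenario in the report.

    acked:   edits the writer (rs1) believes were written successfully
    visible: edits the replayer (rs2) sees after renaming the file
    """

    def __init__(self):
        self.sealed = False
        self.renamed = False
        self.acked = []
        self.visible = []

    def write(self, edit):
        if self.sealed:
            # Proposed semantics: a sealed file rejects all further writes.
            raise SealedError("file is sealed")
        self.acked.append(edit)
        if not self.renamed:
            # Simplification: only pre-rename writes reach the replayer.
            self.visible.append(edit)

    def seal(self):
        self.sealed = True

    def rename(self):
        self.renamed = True

    def lost_edits(self):
        """Edits that were acked to the writer but never seen by the replayer."""
        return [e for e in self.acked if e not in self.visible]


def run_scenario(seal_before_rename):
    log = ToyLogFile()
    log.write("edit-1")          # rs1 writes normally
    if seal_before_rename:
        log.seal()               # hmaster freezes/seals rs1's hlog
    log.rename()                 # rs2 renames the hlog and starts replay
    try:
        log.write("edit-2")      # rs1 is still running and keeps writing
    except SealedError:
        pass                     # the write fails instead of being falsely acked
    return log.lost_edits()
```

Without sealing, `run_scenario(False)` reports `edit-2` as lost. With sealing, `run_scenario(True)` reports nothing lost, because the late write is rejected rather than acknowledged.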

