[jira] [Commented] (HBASE-11218) Data loss in HBase standalone mode

Liu Shaohui (JIRA) Fri, 23 May 2014 01:50:18 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006980#comment-14006980
 ]


Liu Shaohui commented on HBASE-11218:
-------------------------------------

[~nkeywal]
{quote}
I'm curious, why is it an issue for you?
{quote}
We use standalone HBase in our onebox test environment. There we start the 
application and the services it depends on (e.g., HBase, ZooKeeper) on a single 
machine, then run many end-to-end logic and failure tests. So it is 
important to keep some level of data durability. Data loss in standalone mode 
gives our users a bad impression of HBase.

{quote}
On the patch, will this slow down the tests?
{quote}
I will do a comparison and show the data later.


> Data loss in HBase standalone mode
> ----------------------------------
>
>                 Key: HBASE-11218
>                 URL: https://issues.apache.org/jira/browse/HBASE-11218
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>             Fix For: 0.99.0
>
>         Attachments: HBASE-11218-trunk-v1.diff
>
>
> Data loss in HBase standalone mode.
> *How to produce it*
> # Start HBase in standalone mode.
> # Create a table using the hbase shell.
> # Scan '.META.' and you will find data in the meta table.
> # Kill the HBase process with the -9 option.
> # Start HBase again.
> # Scan '.META.' and you will find nothing in the meta table.
> *There are three main reasons.*
> # FSDataOutputStream.sync should call flush() if the underlying wrapped 
> stream is not Syncable. See HADOOP-8861.
> # writeChecksum is true by default in LocalFileSystem, and 
> ChecksumFSOutputSummer buffers the data, so the WALEdits are not written 
> to the OS filesystem immediately when sync is called, and those edits are 
> lost on regionserver failover.
> # The MiniZooKeeperCluster deletes the old ZooKeeper data at startup, which may 
> cause data loss in the meta table. The failover procedure is: split the previous root 
> regionserver's hlog -> assign root -> split the previous meta regionserver's hlog -> 
> assign meta -> split all other regionservers' hlogs -> assign other regions. 
> If there is no data in ZooKeeper, we get null for the root regionserver and 
> then assign the root table right away. Some data in the root table may be lost because some of 
> root's WALEdits have not been split and replayed. The same applies to the meta table.
> I have finished the patch for 0.94 and am working on the patch for trunk. 
> Suggestions are welcome.
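
The buffering problem described in the first two reasons can be illustrated with plain java.io, without any Hadoop classes. The sketch below is only an analogy: BufferedOutputStream stands in for the buffering done by ChecksumFSOutputSummer, and the class name BufferedSyncSketch and the method writeThenFlush are hypothetical names made up for this example, not part of HBase or Hadoop. It shows that bytes written to a buffered stream are not in the file until flush() is called, which is why a sync that does not flush can lose edits when the process is killed.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BufferedSyncSketch {

    /**
     * Writes one small "edit" through a buffered stream and returns the file
     * size observed before and after flush(). Hypothetical helper, for
     * illustration only.
     */
    static long[] writeThenFlush(Path file, byte[] edit) throws IOException {
        try (FileOutputStream raw = new FileOutputStream(file.toFile());
             OutputStream buffered = new BufferedOutputStream(raw, 8192)) {
            // The edit is smaller than the buffer, so it stays in memory.
            buffered.write(edit);
            long beforeFlush = Files.size(file);

            // flush() pushes the buffered bytes down to the OS ...
            buffered.flush();
            // ... and FileDescriptor.sync() asks the OS to persist them.
            raw.getFD().sync();
            long afterFlush = Files.size(file);

            return new long[] {beforeFlush, afterFlush};
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("wal-sketch", ".log");
        try {
            byte[] edit = "put row1 cf:q=v".getBytes(StandardCharsets.UTF_8);
            long[] sizes = writeThenFlush(file, edit);
            System.out.println("bytes in file before flush: " + sizes[0]);
            System.out.println("bytes in file after flush:  " + sizes[1]);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A process killed with -9 between the write and the flush would leave the file without the edit, which matches the symptom in the reproduction steps above.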



--
This message was sent by Atlassian JIRA
(v6.2#6252)
