[jira] [Commented] (HBASE-7671) Flushing memstore again after last failure could cause data loss

Lars Hofhansl (JIRA) Sun, 27 Jan 2013 22:25:16 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564082#comment-13564082
 ]


Lars Hofhansl commented on HBASE-7671:
--------------------------------------

Thanks Chunhui.

Yes, we'd store the region's sequence ID at the time of the snapshot together 
with every snapshot (we would not need to maintain a separate sequenceId, but 
just remember it with the snapshot).

On the other hand, it would entangle concepts (store with region), and copying 
new KVs over into the existing snapshot avoids that.
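
To make the first option concrete, here is a rough sketch of what "remember it 
with the snapshot" could look like. This is not the HBase MemStore code and not 
what the attached patches do; SnapshotSeqIdSketch, Snapshot, and the String 
edits are all made-up names for illustration. The point is only that a retried 
flush gets back the pending snapshot together with the sequence ID it was taken 
at, rather than a newer one.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; these are not the real HBase classes. */
class SnapshotSeqIdSketch {

    /** A snapshot that carries the region sequence ID current when it was taken. */
    static final class Snapshot {
        final List<String> edits;
        final long seqId;

        Snapshot(List<String> edits, long seqId) {
            this.edits = edits;
            this.seqId = seqId;
        }
    }

    private List<String> active = new ArrayList<>();
    private Snapshot pending; // null when no flush is outstanding

    void add(String edit) {
        active.add(edit);
    }

    /**
     * Takes a new snapshot, or returns the pending one unchanged if an earlier
     * flush never cleared it. In the second case the caller's newer sequence ID
     * is ignored, so the flushed file is labeled with the original ID.
     */
    Snapshot snapshot(long currentRegionSeqId) {
        if (pending == null) {
            pending = new Snapshot(active, currentRegionSeqId);
            active = new ArrayList<>();
        }
        return pending;
    }

    /** Called only after the flushed file has been committed successfully. */
    void clearSnapshot() {
        pending = null;
    }

    public static void main(String[] args) {
        SnapshotSeqIdSketch memstore = new SnapshotSeqIdSketch();
        memstore.add("a");
        Snapshot first = memstore.snapshot(100L); // flush starts, then fails before clearSnapshot()
        memstore.add("b");                        // new edit arrives after the failed flush
        Snapshot retry = memstore.snapshot(200L); // second flush attempt
        System.out.println("retry seqId=" + retry.seqId + " edits=" + retry.edits);
        System.out.println("same snapshot: " + (retry == first));
    }
}
```

In this sketch the retried flush writes only "a" and labels the file with the 
original sequence ID, so "b" would still be replayed from the log instead of 
being skipped as already flushed.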

                
> Flushing memstore again after last failure could cause data loss
> ----------------------------------------------------------------
>
>                 Key: HBASE-7671
>                 URL: https://issues.apache.org/jira/browse/HBASE-7671
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.4
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.96.0, 0.94.5
>
>         Attachments: HBASE-7671.patch, HBASE-7671v2.patch, HBASE-7671v3.patch
>
>
> See the following logs first:
> {code}
> 2013-01-23 18:58:38,801 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Flushed , sequenceid=9746535080, memsize=101.8m, into tmp file 
> hdfs://dw77.kgb.sqa.cm4:9900/hbase-test3/writetest1/8dc14e35b4d7c0e481e0bb30849cff7d/.tmp/bebeeecc56364b6c8126cf1dc6782a25
> 2013-01-23 18:58:41,982 WARN org.apache.hadoop.hbase.regionserver.MemStore: 
> Snapshot called again without clearing previous. Doing nothing. Another 
> ongoing flush or did we fail last attempt?
> 2013-01-23 18:58:43,274 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Flushed , sequenceid=9746599334, memsize=101.8m, into tmp file 
> hdfs://dw77.kgb.sqa.cm4:9900/hbase-test3/writetest1/8dc14e35b4d7c0e481e0bb30849cff7d/.tmp/4eede32dc469480bb3d469aaff332313
> {code}
> The first memstore flush failed in commitFile() (the first log entry above) 
> and triggered a server abort, but another flush arrived immediately 
> (possibly caused by a move/split; the third log entry above) and succeeded.
> For the same memstore snapshot we therefore get two different sequenceids, 
> which causes data loss when replaying log edits.
> See the unit test case in the patch for details.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
