[jira] [Commented] (HBASE-13877) Interrupt to flush from TableFlushProcedure causes dataloss in ITBLL

stack (JIRA) Wed, 10 Jun 2015 22:15:19 -0700

    [ https://issues.apache.org/jira/browse/HBASE-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581442#comment-14581442 ]


stack commented on HBASE-13877:
-------------------------------

[~enis] any comment on [~Apache9] remark?

Once the above is dealt with, I'm +1 on commit. I found no incidence of the 
original issue in my run after looking through all the logs. In my case, I am 
seeing double-assignment over a master restart:

2015-06-09 20:06:20,839 WARN  [c2020.halxg.cloudera.com,16000,1433905568816-GeneralBulkAssigner-1] master.AssignmentManager: Assigning a region not in region states: {ENCODED => 6fbe22ff15c2e5f2b207f79eaf8f382a, NAME => 'IntegrationTestBigLinkedList,\xEB\x85\x1E\xB8Q\xEB\x85\x10,1433895189133.6fbe22ff15c2e5f2b207f79eaf8f382a.', STARTKEY => '\xEB\x85\x1E\xB8Q\xEB\x85\x10', ENDKEY => '\xF5\xC2\x8F\x5C(\xF5\xC2\x80'}

I will open a new 'My Struggle' issue once I have figured out more about why 
the double-assign happens and, in turn, why it leads to dataloss (I don't see 
how at the moment; I will keep digging).

> Interrupt to flush from TableFlushProcedure causes dataloss in ITBLL
> --------------------------------------------------------------------
>
>                 Key: HBASE-13877
>                 URL: https://issues.apache.org/jira/browse/HBASE-13877
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>            Priority: Blocker
>             Fix For: 2.0.0, 1.2.0, 1.1.1
>
>         Attachments: hbase-13877_v1.patch, hbase-13877_v2-branch-1.1.patch
>
>
> ITBLL with 1.25B rows failed for me (and Stack as reported in 
> https://issues.apache.org/jira/browse/HBASE-13811?focusedCommentId=14577834&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14577834)
>  
> HBASE-13811 and HBASE-13853 fixed an issue with WAL edit filtering. 
> The root cause this time seems to be different. It is due to the 
> procedure-based flush interrupting the flush request when the procedure is 
> cancelled because of an exception elsewhere. This leaves the memstore 
> snapshot intact without aborting the server. The next flush then flushes the 
> previous memstore with the current seqId (as opposed to the seqId from the 
> memstore snapshot). This creates an hfile with a larger seqId than its 
> contents warrant. The previous behavior in 0.98 and 1.0 (I believe) is that, 
> after flush prepare, an interruption / exception causes an RS abort.
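
The sequence in the description above can be sketched with a toy model. This is 
not HBase code: the class, fields, and methods (FlushSeqIdSketch, ToyMemstore, 
prepare, commit, lostOnReplay) are all hypothetical names made up for 
illustration. It also assumes that recovery skips any WAL edit whose seqId is at 
or below the seqId stamped on a flushed file, which is my reading of why an 
over-stamped hfile would turn into dataloss.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the flush sequence; hypothetical names, not HBase code. */
public class FlushSeqIdSketch {

    /** A flushed file: the edits it holds and the seqId it is stamped with. */
    static class FlushedFile {
        final List<Long> editSeqIds;
        final long stampedSeqId;

        FlushedFile(List<Long> editSeqIds, long stampedSeqId) {
            this.editSeqIds = editSeqIds;
            this.stampedSeqId = stampedSeqId;
        }
    }

    static class ToyMemstore {
        long currentSeqId = 0;
        long snapshotSeqId = -1;
        List<Long> active = new ArrayList<>();
        List<Long> snapshot = new ArrayList<>();

        /** Each edit gets the next sequence id and lands in the active set. */
        void put() {
            active.add(++currentSeqId);
        }

        /** Flush prepare: move active edits to the snapshot, record their max seqId. */
        void prepare() {
            if (!snapshot.isEmpty()) {
                return; // snapshot left behind by an interrupted flush; keep it
            }
            snapshot = active;
            active = new ArrayList<>();
            snapshotSeqId = currentSeqId;
        }

        /** Flush commit: write the snapshot out, stamped one of two ways. */
        FlushedFile commit(boolean stampWithCurrentSeqId) {
            long stamp = stampWithCurrentSeqId ? currentSeqId : snapshotSeqId;
            FlushedFile file = new FlushedFile(new ArrayList<>(snapshot), stamp);
            snapshot = new ArrayList<>();
            return file;
        }
    }

    /** Edits that are in no file yet would be skipped on replay (assumed rule). */
    public static List<Long> lostOnReplay(boolean stampWithCurrentSeqId) {
        ToyMemstore m = new ToyMemstore();
        m.put();
        m.put();     // edits 1 and 2
        m.prepare(); // snapshot = [1, 2], snapshotSeqId = 2
        // The flush is interrupted here: snapshot stays, server is not aborted.
        m.put();
        m.put();     // edits 3 and 4 go to the active set
        m.prepare(); // next flush finds the old snapshot still in place
        FlushedFile file = m.commit(stampWithCurrentSeqId);

        List<Long> lost = new ArrayList<>();
        for (long seqId : m.active) {
            if (seqId <= file.stampedSeqId) {
                lost.add(seqId); // unflushed, but replay would filter it out
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        System.out.println("stamped with current seqId:  " + lostOnReplay(true));
        System.out.println("stamped with snapshot seqId: " + lostOnReplay(false));
    }
}
```

In this toy run, stamping with the current seqId reports edits 3 and 4 as 
unflushed yet filtered on replay, while stamping with the snapshot's own seqId 
reports none.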



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
