[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880489#comment-16880489
 ] 

Sean Busbey commented on HBASE-22075:
-------------------------------------

My current opinion is that there are a couple of different issues to solve here.

1) I found that every place where this particular dataloss test shows a 
problem includes HBASE-16812. Before that change there was a lock preventing 
overlaps between compaction and mob compaction. CDH5's backport of the MOB 
feature does not include this change.

Since that change is too far back to easily revert in master or branches-2, I'm 
going to test this theory by backporting it on top of CDH5 and seeing whether 
the IT then shows the dataloss. I'll report back.

2) Independent of the problem with races between compaction and mob compaction, 
I think the use of bulk load to commit the updated ref files can fail 
non-atomically, i.e. with only part of the references loaded. We should either 
confirm that it can't or rework how we commit the updated mob references. My 
intuition is that we should be able to do this region-by-region using the 
building blocks that bulk loading is based on, without needing to completely 
overhaul mob accounting or mob compaction (e.g. we shouldn't need something 
like the distributed procedure based mob compaction from HBASE-15381).
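
To make the concern in 2) concrete, here is a toy model of the failure mode and 
of one possible region-by-region alternative. It is only a sketch: none of the 
class or method names below are HBase APIs, the "regions" and "MOB files" are 
plain in-memory collections, and the rule "keep the new MOB file while any 
region has committed a reference to it" is an assumption made for illustration, 
not a worked-out design.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: every name here is hypothetical, nothing is an HBase API.
class MobRefCommitSketch {
    // Region name -> names of the MOB files that region's cells refer to.
    final Map<String, Set<String>> refsByRegion = new LinkedHashMap<>();
    // MOB files that currently exist on the pretend filesystem.
    final Set<String> mobFiles = new HashSet<>();

    MobRefCommitSketch(List<String> regions) {
        for (String r : regions) refsByRegion.put(r, new HashSet<>());
    }

    // Load the new references into one region; failAt simulates a failed load.
    private void commitRegion(String region, String newMobFile, String failAt) {
        if (region.equals(failAt)) {
            throw new IllegalStateException("load failed for " + region);
        }
        refsByRegion.get(region).add(newMobFile);
    }

    // Failure mode from the issue: the load stops partway through, then
    // cleanup deletes the new MOB file unconditionally.
    void commitAllThenCleanup(String newMobFile, String failAt) {
        mobFiles.add(newMobFile);
        try {
            for (String region : refsByRegion.keySet()) {
                commitRegion(region, newMobFile, failAt);
            }
        } catch (IllegalStateException e) {
            mobFiles.remove(newMobFile); // earlier regions already refer to it
        }
    }

    // Region-by-region variant: track which regions committed and delete the
    // new MOB file only if no region refers to it.
    List<String> commitPerRegion(String newMobFile, String failAt) {
        mobFiles.add(newMobFile);
        List<String> committed = new ArrayList<>();
        for (String region : refsByRegion.keySet()) {
            try {
                commitRegion(region, newMobFile, failAt);
                committed.add(region);
            } catch (IllegalStateException e) {
                // This region keeps its old references and could be retried.
            }
        }
        if (committed.isEmpty()) mobFiles.remove(newMobFile);
        return committed;
    }

    // References that point at a MOB file that no longer exists.
    List<String> danglingRefs() {
        List<String> dangling = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : refsByRegion.entrySet()) {
            for (String f : e.getValue()) {
                if (!mobFiles.contains(f)) dangling.add(e.getKey() + "->" + f);
            }
        }
        return dangling;
    }
}
```

With three regions and a simulated failure in the second one, 
commitAllThenCleanup leaves the first region holding a reference to a deleted 
file, while commitPerRegion leaves no dangling references. Whether the real 
bulk load path can actually end up in the first state is exactly what would 
need to be confirmed.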

> Potential data loss when MOB compaction fails
> ---------------------------------------------
>
>                 Key: HBASE-22075
>                 URL: https://issues.apache.org/jira/browse/HBASE-22075
>             Project: HBase
>          Issue Type: Bug
>          Components: mob
>    Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>            Priority: Critical
>              Labels: compaction, mob
>             Fix For: 2.0.6, 2.2.1, 2.1.6
>
>         Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



