[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882070#comment-16882070
 ] 

Sean Busbey commented on HBASE-22075:
-------------------------------------

bq. As for HBASE-16812, I do not think it is the only patch which affected MOB 
in a bad way - there should be others. I am saying that, because my own test 
failed with HBASE-16812 reverted (on HDP-2.6.5).

Yeah, I agree here. I finished backporting HBASE-16812 onto CDH5.13.3 last 
night and the IT still shows no data loss.

bq. To prevent non-atomic failures we will need acid txs? No?

To prevent cross-region non-atomic failures, yes. But I don't think we need to 
prevent that; we just need to update the logic for committing the updated refs 
so that it handles non-atomic failure. The bulk load code assumes someone will 
handle the failure externally, and then we don't. I think instead we 
should use the per-region atomic commit of bulk loaded files to track when 
we've successfully made use of our newly compacted files. If we can't succeed 
on retry for a given region, we can just keep the old MOB files around for 
references in that region and then clean things up on the next MOB compaction.
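
A rough sketch of that commit logic follows, as a standalone simulation rather 
than actual HBase code. Every name in it (MobCommitSketch, RegionBulkLoader, 
commit, Result) is hypothetical and stands in for whatever the compactor and 
bulk load path really expose. It assumes only that a bulk load into a single 
region either fully succeeds or fully fails.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; none of these names are HBase APIs. */
final class MobCommitSketch {

  /** Hypothetical stand-in for the per-region atomic bulk load of a ref file. */
  interface RegionBulkLoader {
    /** Returns true only if the region atomically accepted the whole ref file. */
    boolean bulkLoad(String region, String refFile);
  }

  /** Outcome of one commit pass over all regions touched by the compaction. */
  static final class Result {
    final Set<String> committedRegions = new TreeSet<>();
    final Set<String> retainedOldMobFiles = new TreeSet<>();
    final Set<String> deletableOldMobFiles = new TreeSet<>();
    boolean keepNewMobFile;
  }

  static Result commit(Map<String, String> refFileByRegion,
                       Map<String, Set<String>> oldMobFilesByRegion,
                       RegionBulkLoader loader,
                       int maxAttempts) {
    Result result = new Result();
    for (Map.Entry<String, Set<String>> e : oldMobFilesByRegion.entrySet()) {
      String region = e.getKey();
      String refFile = refFileByRegion.get(region);
      boolean loaded = false;
      // Retry the per-region load; a region with no ref file counts as failed.
      for (int i = 0; refFile != null && i < maxAttempts && !loaded; i++) {
        loaded = loader.bulkLoad(region, refFile);
      }
      if (loaded) {
        result.committedRegions.add(region);
      } else {
        // This region still references the old MOB files, so they must stay
        // until a later MOB compaction succeeds.
        result.retainedOldMobFiles.addAll(e.getValue());
      }
    }
    // An old MOB file is deletable only if no failed region references it.
    for (Set<String> files : oldMobFilesByRegion.values()) {
      for (String file : files) {
        if (!result.retainedOldMobFiles.contains(file)) {
          result.deletableOldMobFiles.add(file);
        }
      }
    }
    // The new MOB file must survive if any region now points at it.
    result.keepNewMobFile = !result.committedRegions.isEmpty();
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> refs = new LinkedHashMap<>();
    refs.put("r1", "ref-r1");
    refs.put("r2", "ref-r2");
    Map<String, Set<String>> old = new LinkedHashMap<>();
    old.put("r1", new TreeSet<>(Arrays.asList("a", "b")));
    old.put("r2", new TreeSet<>(Arrays.asList("b", "c")));
    // Simulated loader: region r2 never accepts its ref file.
    Result res = commit(refs, old, (region, ref) -> !region.equals("r2"), 3);
    System.out.println("committed=" + res.committedRegions
        + " retained=" + res.retainedOldMobFiles
        + " deletable=" + res.deletableOldMobFiles
        + " keepNew=" + res.keepNewMobFile);
  }
}
```

In the simulated run, r1 commits and r2 does not, so only old file "a" becomes 
deletable. Files "b" and "c" stay because r2 still references them, and the 
new MOB file is kept because r1 now points at it.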

> Potential data loss when MOB compaction fails
> ---------------------------------------------
>
>                 Key: HBASE-22075
>                 URL: https://issues.apache.org/jira/browse/HBASE-22075
>             Project: HBase
>          Issue Type: Bug
>          Components: mob
>    Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>            Priority: Critical
>              Labels: compaction, mob
>             Fix For: 2.0.6, 2.2.1, 2.1.6
>
>         Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during the last step (bulk load of a newly created 
> reference file), there is a high chance of data loss due to a partially loaded 
> reference file whose cells refer to a (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with references to this file might already have been 
> loaded into HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
