[ https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799031#comment-16799031 ]
Josh Elser commented on HBASE-22075:
------------------------------------
{quote}Forced splits during bulkload due to changed region boundaries,
[~elserj].
{quote}
Ah, the import of the ref file itself being split (even though it has one
entry). I'm with you now. I'll have to look at how finding MOB files works on
split (there must be some logic for the daughter regions to find the MOB files
from the parent region, right?).
So, we end up trying to bulk-load a ref file into two regions (after a split);
one of them succeeds and one of them doesn't. Thus, that new ref file
overwrites the old refs, and if we clean up the MOB files, we lose data. Ok, I
can see that now :)
Is this something that we can show in a unit test? I feel like that would go a
long way to help prevent data loss in the future.
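As a starting point for such a test, the failure sequence described above can be sketched as a toy model. This is only an illustration under stated assumptions: it uses plain Java collections rather than any HBase API, and the names Region, bulkLoad, danglingRows, simulate, "mob-old" and "mob-new" are all hypothetical. It shows why deleting the new MOB file after a half-applied bulk load leaves references pointing at nothing.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model only: plain collections stand in for regions and MOB files; no HBase APIs are used. */
public class MobPartialLoadSketch {

    /** Hypothetical stand-in for a region: row key -> name of the MOB file the cell refers to. */
    static final class Region {
        final Map<String, String> refs = new TreeMap<>();
        final boolean failsBulkLoad;

        Region(boolean failsBulkLoad) {
            this.failsBulkLoad = failsBulkLoad;
        }

        /** Returns true if the new references were applied, overwriting the old ones. */
        boolean bulkLoad(Map<String, String> newRefs) {
            if (failsBulkLoad) {
                return false;
            }
            refs.putAll(newRefs);
            return true;
        }
    }

    /** Returns the row keys whose reference points at a MOB file that no longer exists. */
    static List<String> danglingRows(List<Region> regions, Set<String> mobFiles) {
        List<String> dangling = new ArrayList<>();
        for (Region region : regions) {
            for (Map.Entry<String, String> e : region.refs.entrySet()) {
                if (!mobFiles.contains(e.getValue())) {
                    dangling.add(e.getKey());
                }
            }
        }
        return dangling;
    }

    /** Replays the sequence from the discussion and returns the rows left dangling. */
    public static List<String> simulate() {
        Set<String> mobFiles = new HashSet<>(Set.of("mob-old"));

        // After the split, two daughter regions each hold references to the old MOB file.
        Region left = new Region(false);
        Region right = new Region(true); // this half of the bulk load fails
        left.refs.put("row-a", "mob-old");
        right.refs.put("row-m", "mob-old");

        // Compaction writes a new MOB file plus a reference file, split at the region boundary.
        mobFiles.add("mob-new");
        boolean leftOk = left.bulkLoad(Map.of("row-a", "mob-new"));
        boolean rightOk = right.bulkLoad(Map.of("row-m", "mob-new"));

        // On failure the new MOB file is deleted, but the half that loaded is not rolled back.
        if (!(leftOk && rightOk)) {
            mobFiles.remove("mob-new");
        }
        return danglingRows(List.of(left, right), mobFiles);
    }

    public static void main(String[] args) {
        System.out.println("Dangling rows: " + simulate());
    }
}
```

In this model the row loaded into the first region ends up referring to the deleted file, while the row in the failed region still refers to the old one. A real unit test would need to force a split between writing the reference file and bulk-loading it, which this sketch does not attempt.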
> Potential data loss when MOB compaction fails
> ---------------------------------------------
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
> Issue Type: Bug
> Components: mob
> Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4,
> 2.1.3
> Reporter: Vladimir Rodionov
> Assignee: Vladimir Rodionov
> Priority: Critical
> Labels: mob
> Fix For: 2.2.0, 2.0.5, 2.1.4
>
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during the last step (bulk load of a newly created
> reference file), there is a high chance of data loss due to a partially loaded
> reference file whose cells refer to a (now) non-existent MOB file. The
> newly created MOB file is deleted automatically in case of a MOB compaction
> failure, but some cells with references to this file may already have been
> loaded into HBase.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)