[ 
https://issues.apache.org/jira/browse/HBASE-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingcheng Du updated HBASE-16841:
---------------------------------
    Attachment: HBASE-16841-V2.patch

Upload a new patch to fix the failures in tests about restoring snapshots.
Hi [~tedyu], it is a little difficult to add a test for this case, it needs 
some delayed flush in some regions during the snapshot which is hard to mimic. 
Mob doesn't allow a configurable flusher which is designed by purpose to reduce 
the configurations when using mob.
I think this is the issue that caused the failures sometimes in unit tests. 
Hope this patch can fix them all.
Hi [~mbertozzi], do you want to look at this patch? Thanks!

> Data loss in MOB files after cloning a snapshot and deleting that snapshot
> --------------------------------------------------------------------------
>
>                 Key: HBASE-16841
>                 URL: https://issues.apache.org/jira/browse/HBASE-16841
>             Project: HBase
>          Issue Type: Bug
>          Components: mob, snapshots
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>         Attachments: HBASE-16841-V2.patch, HBASE-16841.patch
>
>
> Running the following steps will probably lose MOB data when working with 
> snapshots.
> 1. Create a mob-enabled table by running create 't1', {NAME => 'f1', IS_MOB 
> => true, MOB_THRESHOLD => 0}.
> 2. Put millions of data.
> 3. Run {{snapshot 't1','t1_snapshot'}} to take a snapshot for this table t1.
> 4. Run {{clone_snapshot 't1_snapshot','t1_cloned'}} to clone this snapshot.
> 5. Run {{delete_snapshot 't1_snapshot'}} to delete this snapshot.
> 6. Run {{disable 't1'}} and {{delete 't1'}} to delete the table.
> 7. Now go to the archive directory of t1, the number of .link directories is 
> different from the number of hfiles which means some data will be lost after 
> the hfile cleaner runs.
> This is because, when taking a snapshot on a enabled mob table, each region 
> flushes itself and takes a snapshot, and the mob snapshot is taken only if 
> the current region is first region of the table. At that time, the flushing 
> of some regions might not be finished, and some mob files are not flushed to 
> disk yet. Eventually some mob files are not recorded in the snapshot manifest.
> To solve this, we need to take the mob snapshot at last after the snapshots 
> on all the online and offline regions are finished in 
> {{EnabledTableSnapshotHandler}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to