[ 
https://issues.apache.org/jira/browse/HBASE-16649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15524882#comment-15524882
 ] 

Hudson commented on HBASE-16649:
--------------------------------

FAILURE: Integrated in Jenkins build HBase-1.4 #432 (See 
[https://builds.apache.org/job/HBase-1.4/432/])
HBASE-16649 Truncate table with splits preserved can cause both data 
(matteo.bertozzi: rev 4566e4df58bdd176228aab2cd3cfd80dd983072f)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestTruncateTableProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteTableProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java


> Truncate table with splits preserved can cause both data loss and truncated 
> data appeared again
> -----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16649
>                 URL: https://issues.apache.org/jira/browse/HBASE-16649
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.1.3
>            Reporter: Allan Yang
>            Assignee: Matteo Bertozzi
>             Fix For: 2.0.0, 1.3.0, 1.1.7, 0.98.23, 1.2.4
>
>         Attachments: HBASE-16649-v0.patch, HBASE-16649-v1.patch, 
> HBASE-16649-v2.patch
>
>
> Truncate table with splits preserved deletes the hfiles but reuses the 
> previous regioninfo, which can cause odd behaviors:
> - Case 1: *Data appeared after truncate*
> reproduce procedure:
> 1. create a table, let's say 'test'
> 2. write data to 'test', make sure memstore of 'test' is not empty
> 3. truncate 'test' with splits preserved
> 4. kill the regionserver hosting the region(s) of 'test'
> 5. start the regionserver. Now it is time to witness the miracle: the 
> truncated data reappears in table 'test'
> - Case 2: *Data loss*
> reproduce procedure:
> 1. create a table, let's say 'test'
> 2. write some data to 'test', no matter how many
> 3. truncate 'test' with splits preserved
> 4. restart the regionserver to reset the seqid
> 5. write some data, but less than in step 2, since we don't want the seqid 
> to run past the one reached in step 2
> 6. kill the regionserver hosting the region(s) of 'test'
> 7. restart the regionserver. Congratulations! The data written in step 5 is 
> now all lost
> *Why?*
> For case 1:
> Preserving splits in the truncate table procedure does not change the 
> regioninfo, so when log replay happens, the 'unflushed' data is restored 
> to the region.
> For case 2:
> The flushedSequenceIdByRegion map is stored in the Master and keyed by the 
> region's encodedName. Although the table is truncated, the region's name 
> does not change because we chose to preserve the splits. So after the 
> truncate, the region's sequenceid is reset in the regionserver but not in 
> the master. When a flush is reported to the master, the master rejects the 
> sequenceid update because the new value is smaller than the old one. The 
> same happens in log replay: all the edits written in step 5 are skipped 
> because they have a smaller seqid.
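
The case 2 explanation above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual ServerManager code: the class name FlushedSeqIdSketch, its methods, the region name "abc123" and the seqid values are all made up for illustration. Only the idea of a map keyed by the region's encoded name, which refuses to move a seqid backwards, comes from the description. The clearOnTruncate method is a hypothetical remedy and is not claimed to be what the committed patch does.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the master-side bookkeeping described above (hypothetical, not HBase code). */
final class FlushedSeqIdSketch {
    // encoded region name -> highest flushed sequence id the master has accepted
    private final Map<String, Long> flushedSequenceIdByRegion = new HashMap<>();

    /** Stores the reported seqid unless it would go backwards; returns true if stored. */
    boolean reportFlush(String encodedRegionName, long seqId) {
        Long existing = flushedSequenceIdByRegion.get(encodedRegionName);
        if (existing != null && seqId < existing) {
            return false; // a smaller seqid looks stale, so the update is rejected
        }
        flushedSequenceIdByRegion.put(encodedRegionName, seqId);
        return true;
    }

    /** In this model, replay keeps an edit only if its seqid is above the recorded one. */
    boolean wouldReplay(String encodedRegionName, long editSeqId) {
        Long flushed = flushedSequenceIdByRegion.get(encodedRegionName);
        return flushed == null || editSeqId > flushed;
    }

    /** Hypothetical remedy: forget the region's entry when its table is truncated. */
    void clearOnTruncate(String encodedRegionName) {
        flushedSequenceIdByRegion.remove(encodedRegionName);
    }

    public static void main(String[] args) {
        FlushedSeqIdSketch master = new FlushedSeqIdSketch();
        String region = "abc123"; // made-up encoded name; unchanged when splits are preserved

        master.reportFlush(region, 100L); // data written and flushed before the truncate

        // After truncate (splits preserved) and a restart, the regionserver's seqid starts low again.
        System.out.println("flush with seqid 5 accepted: " + master.reportFlush(region, 5L));
        System.out.println("edit with seqid 5 replayed: " + master.wouldReplay(region, 5L));

        // With the stale entry removed, the same edit would be replayed.
        master.clearOnTruncate(region);
        System.out.println("after clearing, replayed: " + master.wouldReplay(region, 5L));
    }
}
```

In this model the first two lines print false and the last prints true: while the stale entry for the unchanged encoded name remains, the post-truncate writes are treated as already flushed.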



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
