[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526933#comment-16526933
 ] 

Francis Liu commented on HBASE-20704:
-------------------------------------

Hmm, actually, even if we add the list of parent storefiles in there, there's
still a corner case during regionserver failover that we won't cover (i.e. a
dead RS commits a compacted storefile before aborting and the region has
already been opened elsewhere). It seems the more straightforward and intuitive
way to solve this is the currently proposed approach: clean up the compacted
storefiles on close, and make sure the still-relevant compaction markers are
replayed on the region for HBASE-20724. Let me go down this route and see how
that goes.

Long term, though, I think the better solution is to keep the list of active
(and possibly compacted) storefiles in meta. That way we can apply the changes
atomically. That will probably only be a viable option once splitting meta is
available, though.

> Sometimes some compacted storefiles are not archived on region close
> --------------------------------------------------------------------
>
>                 Key: HBASE-20704
>                 URL: https://issues.apache.org/jira/browse/HBASE-20704
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>    Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>            Reporter: Francis Liu
>            Assignee: Francis Liu
>            Priority: Critical
>         Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close, compacted files which have not yet been archived by the
> discharger are archived as part of the region closing process. It is
> important that these files are wholly archived to ensure data consistency.
> For example, a storefile containing delete tombstones can be archived while
> older storefiles containing cells that were supposed to be deleted are left
> unarchived, thereby undeleting those cells.
> On region close, a compacted storefile is skipped from archiving if it has
> read references (i.e. open scanners). This behavior is correct when the
> discharger chore runs, but on region close consistency is more important,
> so we should add a special case that ignores any references on the
> storefile and archives it anyway.
> The attached patch contains a unit test that reproduces the problem and the
> proposed fix.
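
To make the special case described in the issue concrete, here is a minimal,
self-contained Java sketch of the selection rule. It is not the attached patch
and does not use HBase classes: CompactedFile, readRefCount, selectForArchive
and the regionClosing flag are all hypothetical names made up for illustration.
It only shows how skipping referenced files on close can leave an older file
behind while the tombstone file is archived.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CompactedFileArchiveSketch {

    /** Hypothetical stand-in for a compacted storefile; not an HBase class. */
    static final class CompactedFile {
        final String name;
        final int readRefCount; // number of open scanners still reading the file

        CompactedFile(String name, int readRefCount) {
            this.name = name;
            this.readRefCount = readRefCount;
        }
    }

    /**
     * Picks the compacted files to archive.
     * When regionClosing is false (periodic chore), files with read references
     * are skipped. When regionClosing is true, references are ignored so the
     * whole set of compacted files is archived together.
     */
    static List<String> selectForArchive(List<CompactedFile> compacted,
                                         boolean regionClosing) {
        List<String> toArchive = new ArrayList<>();
        for (CompactedFile f : compacted) {
            if (f.readRefCount > 0 && !regionClosing) {
                continue; // chore path: leave referenced files for a later run
            }
            toArchive.add(f.name);
        }
        return toArchive;
    }

    public static void main(String[] args) {
        List<CompactedFile> files = Arrays.asList(
            new CompactedFile("older-with-cells", 1),       // a scanner is open
            new CompactedFile("newer-with-tombstones", 0)); // no readers

        // Chore behavior: only the unreferenced tombstone file is selected.
        System.out.println(selectForArchive(files, false));
        // prints [newer-with-tombstones]

        // Close behavior with the special case: both files are selected.
        System.out.println(selectForArchive(files, true));
        // prints [older-with-cells, newer-with-tombstones]
    }
}
```

If the chore rule were applied on close, the first result is exactly the
inconsistent state the issue describes: the tombstones are archived while the
older cells stay in the region directory.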



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
