[ https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619988#comment-16619988 ]
Francis Liu commented on HBASE-20704:
-------------------------------------
[~apurtell] Attached a branch-1 patch. It's simpler but a bit different; note
that I had to backport a change in FSDataInputStreamWrapper so that an
IOException is thrown. Let me know if this looks good.
> Sometimes some compacted storefiles are not archived on region close
> --------------------------------------------------------------------
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
> Issue Type: Bug
> Components: Compaction
> Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
> Reporter: Francis Liu
> Assignee: Francis Liu
> Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch,
> HBASE-20704.003.patch, HBASE-20704.004.draft.patch, HBASE-20704.005.patch,
> HBASE-20704.006.patch, HBASE-20704.007.patch, HBASE-20704.branch-1.001.patch
>
>
> During region close, compacted files that have not yet been archived by the
> discharger are archived as part of the region closing process. It is
> important that all of these files are archived to ensure data consistency.
> Otherwise, a storefile containing delete tombstones can be archived while
> older storefiles containing cells that were supposed to be deleted are left
> unarchived, thereby undeleting those cells.
> On region close, a compacted storefile is currently skipped from archiving
> if it has read references (i.e. open scanners). This behavior is correct
> when the discharger chore runs, but on region close consistency is more
> important, so we should add a special case that ignores any references on
> the storefile and archives it anyway.
> Attached patch contains a unit test that reproduces the problem and the
> proposed fix.
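
The selection rule described above can be sketched as follows. This is a
minimal illustration only, not the code in the attached patches: the
CompactedFile class, the readRefs field, and the selectForArchive method are
hypothetical names invented for this sketch and do not correspond to HBase
classes or APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the archiving decision; none of these names are HBase APIs.
class CompactedFileArchiveSketch {

    /** Stand-in for a compacted storefile and its count of open read references. */
    static final class CompactedFile {
        final String name;
        final int readRefs; // e.g. number of open scanners still using the file

        CompactedFile(String name, int readRefs) {
            this.name = name;
            this.readRefs = readRefs;
        }
    }

    /**
     * Returns the names of the compacted files that would be archived.
     * When the chore runs (regionClosing == false), files with read references are skipped.
     * On region close (regionClosing == true), references are ignored so that
     * every compacted file is archived together.
     */
    static List<String> selectForArchive(List<CompactedFile> files, boolean regionClosing) {
        List<String> toArchive = new ArrayList<>();
        for (CompactedFile f : files) {
            if (regionClosing || f.readRefs == 0) {
                toArchive.add(f.name);
            }
        }
        return toArchive;
    }

    public static void main(String[] args) {
        List<CompactedFile> files = Arrays.asList(
            new CompactedFile("tombstones", 0), // no readers
            new CompactedFile("old-cells", 1)); // still has an open scanner

        // Chore behavior: the referenced file is left behind for a later run.
        System.out.println(selectForArchive(files, false));
        // Region close with the special case: both files are archived.
        System.out.println(selectForArchive(files, true));
    }
}
```

In this toy scenario, applying the chore rule at region close would archive
"tombstones" but leave "old-cells" unarchived, which is the inconsistency the
issue describes; passing the region-closing flag archives both.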
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)