[
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526346#comment-16526346
]
Mike Drob commented on HBASE-20704:
-----------------------------------
bq. If we add in each created compacted storefile metadata about which
storefiles were its parents, then on the next region open we can remove the
compacted storefiles as part of the region open operation. What do you guys
think?
I think this would cause 1-2 additional NN list operations per file in a
region? When opening a large table with many regions, I worry about swamping
the NN more than we already do. Maybe it's ok though.
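
To make the proposal concrete, here is a minimal sketch of the region-open
check, assuming each storefile's metadata has already been read into a map
from file name to recorded parent names. The class and method names are
hypothetical and are not HBase APIs. Building that map is where the extra
NN traffic would come from, since every file in the store has to be listed
and opened.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names are HBase APIs.
final class CompactedParentCleanup {

    /**
     * @param parentsByFile every storefile currently present in the store, mapped to
     *                      the parent file names recorded in its metadata (empty if none)
     * @return names of files that are still present although another present file
     *         records them as a parent, i.e. compacted files that were never archived
     */
    static Set<String> findLeftoverParents(Map<String, List<String>> parentsByFile) {
        Set<String> leftovers = new TreeSet<>();
        for (List<String> parents : parentsByFile.values()) {
            for (String parent : parents) {
                // A parent that is still on disk next to its compaction output is stale.
                if (parentsByFile.containsKey(parent)) {
                    leftovers.add(parent);
                }
            }
        }
        return leftovers;
    }

    public static void main(String[] args) {
        Map<String, List<String>> store = new HashMap<>();
        store.put("c1", Arrays.asList("a", "b")); // compaction output of a and b
        store.put("a", Collections.<String>emptyList()); // a was never archived
        store.put("d", Collections.<String>emptyList()); // unrelated flush file
        System.out.println(findLeftoverParents(store)); // prints [a]
    }
}
```

In this sketch "b" was archived before the close and "a" was not, so only
"a" is reported for removal on the next open.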
> Sometimes some compacted storefiles are not archived on region close
> --------------------------------------------------------------------
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
> Issue Type: Bug
> Components: Compaction
> Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
> Reporter: Francis Liu
> Assignee: Francis Liu
> Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close, compacted files which have not yet been archived by the
> discharger are archived as part of the region closing process. It is
> important that these files are wholly archived to ensure data consistency,
> i.e. a storefile containing delete tombstones can be archived while older
> storefiles containing cells that were supposed to be deleted are left
> unarchived, thereby undeleting those cells.
> On region close, a compacted storefile is skipped from archiving if it has
> read references (i.e. open scanners). This behavior is correct when the
> discharger chore runs, but on region close consistency is of course more
> important, so we should add a special case to ignore any references on the
> storefile and go ahead and archive it.
> Attached patch contains a unit test that reproduces the problem and the
> proposed fix.
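
The special case described in the quoted issue can be sketched roughly as
follows. This is a simplified model with hypothetical names, not the
attached patch and not HBase's actual store classes: the discharger chore
skips files with open readers, while region close archives every compacted
file regardless of references.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not HBase's store or storefile classes.
final class CompactedFileArchiver {

    static final class StoreFile {
        final String name;
        final int readerRefCount; // number of open scanners still reading this file

        StoreFile(String name, int readerRefCount) {
            this.name = name;
            this.readerRefCount = readerRefCount;
        }
    }

    /**
     * Picks which compacted files to archive.
     *
     * @param regionClosing true on region close, false when the periodic chore runs
     */
    static List<String> selectForArchive(List<StoreFile> compactedFiles, boolean regionClosing) {
        List<String> toArchive = new ArrayList<>();
        for (StoreFile file : compactedFiles) {
            // On close, ignore read references so the whole set is archived together.
            if (regionClosing || file.readerRefCount == 0) {
                toArchive.add(file.name);
            }
        }
        return toArchive;
    }
}
```

With one file still referenced by a scanner, the chore path leaves it in
place, while the close path archives it along with the rest, so a file
holding tombstones can never be archived without the older files it masks.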