[ https://issues.apache.org/jira/browse/HBASE-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-947:
------------------------
Priority: Minor (was: Major)
Summary: [Optimization] Major compaction should remove deletes as well as
the deleted cell (was: Major compaction should remove deletes as well as the
deleted cell)
To be clear, Jim is suggesting an optimization, and a minor one I believe.
Currently, cells are only let go from a store file on a major compaction for
the following reasons:
+ if there are more than MAX_VERSIONS cells, or
+ if the cell timestamp is older than the configured TTL.
Under this regime, we can keep deletes around even though the cell they
overshadow is no longer present (probably because it exceeded MAX_VERSIONS).
The suggestion here is that in this one case, we'd let go of the delete cell
too (when it has no corresponding deleted cell).
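
A rough sketch of the suggestion, to make the one case concrete. This is not
HBase code: the Cell class, the compact() method and the dropOrphanDeletes
flag are all hypothetical, and it assumes that a delete at timestamp T
overshadows puts with timestamp <= T and that MAX_VERSIONS and TTL apply to
puts only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not HBase code: the versions of a single row/column.
class MajorCompactionSketch {
    static final class Cell {
        final long timestamp;
        final boolean isDelete;
        Cell(long timestamp, boolean isDelete) {
            this.timestamp = timestamp;
            this.isDelete = isDelete;
        }
    }

    // Assumption: a delete at timestamp T overshadows puts with timestamp <= T.
    static List<Cell> compact(List<Cell> newestFirst, int maxVersions,
                              long ttl, long now, boolean dropOrphanDeletes) {
        // Existing rules: a put is let go if it is beyond MAX_VERSIONS
        // or older than the TTL.
        List<Cell> survivingPuts = new ArrayList<>();
        for (Cell c : newestFirst) {
            if (c.isDelete) continue;
            boolean overVersions = survivingPuts.size() >= maxVersions;
            boolean expired = now - c.timestamp > ttl;
            if (!overVersions && !expired) survivingPuts.add(c);
        }
        List<Cell> out = new ArrayList<>();
        for (Cell c : newestFirst) {
            if (!c.isDelete) {
                if (survivingPuts.contains(c)) out.add(c);
                continue;
            }
            // Suggested rule: let go of a delete only when no surviving put
            // is overshadowed by it.
            boolean coversSurvivor = false;
            for (Cell p : survivingPuts) {
                if (p.timestamp <= c.timestamp) coversSurvivor = true;
            }
            if (coversSurvivor || !dropOrphanDeletes) out.add(c);
        }
        return out;
    }

    public static void main(String[] args) {
        // Puts at 50, 40, 30, 20 and a delete at 25, with MAX_VERSIONS = 3.
        List<Cell> cells = Arrays.asList(
            new Cell(50, false), new Cell(40, false), new Cell(30, false),
            new Cell(25, true), new Cell(20, false));
        System.out.println(compact(cells, 3, 1000L, 100L, false).size()); // 4
        System.out.println(compact(cells, 3, 1000L, 100L, true).size());  // 3
    }
}
```

In the example the put at 20 goes because it is the fourth version, which
leaves the delete at 25 with nothing to overshadow. Only then is the delete
dropped as well.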
We don't want to let go of the deleted cell on major compaction just because
there is a delete record, because then a user who took out a scanner that was
behind the delete cell's timestamp but in front of the deleted cell's timestamp
would get different results depending on whether a major compaction had run or
not. The current rule, until we decide otherwise, is that they'd only see a
different result if MAX_VERSIONS or TTL had been exceeded.
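
The scanner concern can be illustrated with a small sketch. Again this is a
hypothetical model rather than the HBase scanner: the Cell class and
newestVisiblePut() are made up for illustration, timestamps are assumed to be
non-negative, and a delete at timestamp T is assumed to overshadow puts with
timestamp <= T.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model, not HBase code: what a scanner reading "as of" a
// timestamp would see for a single row/column.
class ScannerViewSketch {
    static final class Cell {
        final long timestamp;
        final boolean isDelete;
        Cell(long timestamp, boolean isDelete) {
            this.timestamp = timestamp;
            this.isDelete = isDelete;
        }
    }

    // Returns the timestamp of the newest put visible at readTs, or -1 if none.
    static long newestVisiblePut(List<Cell> cells, long readTs) {
        // A scanner at readTs only sees deletes at or before readTs.
        long newestDelete = -1;
        for (Cell c : cells) {
            if (c.isDelete && c.timestamp <= readTs) {
                newestDelete = Math.max(newestDelete, c.timestamp);
            }
        }
        long best = -1;
        for (Cell c : cells) {
            boolean visible = !c.isDelete && c.timestamp <= readTs
                && c.timestamp > newestDelete;
            if (visible) best = Math.max(best, c.timestamp);
        }
        return best;
    }

    public static void main(String[] args) {
        // Put at 10, delete at 20; the scanner reads as of 15.
        List<Cell> retained = Arrays.asList(new Cell(20, true), new Cell(10, false));
        // If compaction had dropped the deleted cell along with the delete:
        List<Cell> eagerlyDropped = Collections.emptyList();
        System.out.println(newestVisiblePut(retained, 15));       // 10
        System.out.println(newestVisiblePut(eagerlyDropped, 15)); // -1
    }
}
```

A scanner at 15 sits behind the delete at 20 but in front of the put at 10, so
it sees the put only for as long as the store file still holds it. A scanner
at 25 sees nothing either way.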
Marking this issue minor rather than major.
> [Optimization] Major compaction should remove deletes as well as the deleted
> cell
> ---------------------------------------------------------------------------------
>
> Key: HBASE-947
> URL: https://issues.apache.org/jira/browse/HBASE-947
> Project: Hadoop HBase
> Issue Type: Improvement
> Reporter: Jim Kellerman
> Priority: Minor
> Fix For: 0.19.0
>
>
> Currently major compactions retain both deletes and the deleted cell. They
> should remove both.