[
https://issues.apache.org/jira/browse/HBASE-18989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201410#comment-16201410
]
Duo Zhang commented on HBASE-18989:
-----------------------------------
As discussed in HBASE-18906, we should give CP users a way to know when a
compaction ends. We already have a CompactionLifeCycleTracker, but the problem
is that it does not notify the user if the compaction cannot be scheduled.
Also, since we decided not to expose StoreScanner to CP users, it does not
make sense to allow CP users to return an InternalScanner before we actually
create the StoreScanner in our own code. In the example in HBASE-18747 I wrap
the InternalScanner and then do the filtering in the preCompact method. I think
this is the correct way to do filtering on compaction and flush.
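
To make the wrapping idea concrete, here is a minimal sketch. It is not the
code from HBASE-18747 and does not use the real HBase types: BatchScanner is a
hypothetical stand-in for InternalScanner, and ListScanner is a toy source that
stands in for the scanner HBase would hand to the hook. The only point is that
the wrapper delegates to the scanner it was given and drops cells on the way
out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for InternalScanner: appends the next batch of cells
// to 'out' and returns true if more batches may follow.
interface BatchScanner<C> {
    boolean next(List<C> out);
}

// Wraps the scanner created by the server and filters what it emits.
final class FilteringScanner<C> implements BatchScanner<C> {
    private final BatchScanner<C> delegate;
    private final Predicate<C> keep;

    FilteringScanner(BatchScanner<C> delegate, Predicate<C> keep) {
        this.delegate = delegate;
        this.keep = keep;
    }

    @Override
    public boolean next(List<C> out) {
        List<C> batch = new ArrayList<>();
        boolean more = delegate.next(batch);
        for (C cell : batch) {
            if (keep.test(cell)) {
                out.add(cell); // only cells accepted by the predicate survive
            }
        }
        return more;
    }
}

// Toy source that emits one element per call, for illustration only.
final class ListScanner<C> implements BatchScanner<C> {
    private final Iterator<C> it;

    ListScanner(List<C> cells) {
        this.it = cells.iterator();
    }

    @Override
    public boolean next(List<C> out) {
        if (it.hasNext()) {
            out.add(it.next());
        }
        return it.hasNext();
    }
}
```

In a real coprocessor the wrapper would be returned from preCompact around the
scanner passed in, so the StoreScanner itself is never exposed.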
The limitation of this solution is that we can only remove data during
compaction or flush. In the old example, we could reset the TTL in ScanInfo to
include more data. But I think this is acceptable, as you can use a longer
TTL (such as forever) to include the data, and also set KEEP_DELETED_CELLS to
true and increase the versions to let compaction and flush give you the data
you want, and then do the filtering.
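
The "can only remove" limitation can be illustrated with a toy model. This is
an assumption-laden sketch, not HBase code: ToyCell and ToyCompaction are
hypothetical names, and the store's retention is reduced to a single TTL
check. It shows that the CP filter only sees what the store's own scan lets
through, so the store-level TTL has to be widened first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical, simplified cell: just a row key and a timestamp.
final class ToyCell {
    final String row;
    final long timestamp;

    ToyCell(String row, long timestamp) {
        this.row = row;
        this.timestamp = timestamp;
    }
}

final class ToyCompaction {
    static final long FOREVER = Long.MAX_VALUE;

    // Models the order of operations: the store drops expired cells first,
    // then the CP filter may drop more. The filter can never bring back a
    // cell that the store-level TTL already removed.
    static List<ToyCell> compact(List<ToyCell> cells, long now, long storeTtl,
                                 Predicate<ToyCell> cpKeep) {
        List<ToyCell> out = new ArrayList<>();
        for (ToyCell c : cells) {
            boolean expired = storeTtl != FOREVER && now - c.timestamp > storeTtl;
            if (!expired && cpKeep.test(c)) {
                out.add(c);
            }
        }
        return out;
    }
}
```

With a short store TTL, a filter that wants to keep old cells for some rows
gets nothing to keep; with the store TTL set to FOREVER, the same filter
decides per row what stays.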
Another problem may be performance. When using the original filter or other
mechanisms such as TTL, the StoreScanner may seek rather than skip when you
want to jump to the next row or column, but for now you can only skip. But I
think this is OK for most cases, as a row is usually not very large, and
compaction is not on the critical path of normal operation.
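
The difference between skip and seek can be sketched on a sorted list of row
keys. This is only a toy model with hypothetical names (NextRow, skip, seek);
the real StoreScanner seeks through the underlying store files, not by binary
search over an in-memory list. Both methods return the index of the first
cell of the next row, but skip touches every cell of the current row while
seek jumps.

```java
import java.util.List;

final class NextRow {
    // Skip: walk cell by cell until the row key changes.
    // Cost grows with the number of cells in the current row.
    static int skip(List<String> sortedRows, int pos) {
        String current = sortedRows.get(pos);
        int i = pos;
        while (i < sortedRows.size() && sortedRows.get(i).equals(current)) {
            i++;
        }
        return i;
    }

    // Seek: binary search for the first index whose row key sorts after
    // the current one. Cost is logarithmic in the remaining cells.
    static int seek(List<String> sortedRows, int pos) {
        String current = sortedRows.get(pos);
        int lo = pos + 1;
        int hi = sortedRows.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedRows.get(mid).compareTo(current) > 0) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo;
    }
}
```

For rows with only a handful of cells the two are close, which is why the
skip-only restriction is tolerable for typical compactions.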
Thanks.
> Polish the compaction related CP hooks
> --------------------------------------
>
> Key: HBASE-18989
> URL: https://issues.apache.org/jira/browse/HBASE-18989
> Project: HBase
> Issue Type: Sub-task
> Components: Compaction, Coprocessors
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Fix For: 2.0.0-alpha-4
>
>