[ 
https://issues.apache.org/jira/browse/HBASE-18989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201410#comment-16201410
 ] 

Duo Zhang commented on HBASE-18989:
-----------------------------------

As discussed in HBASE-18906, we should give CP users a way to know when a
compaction ends. We already have a CompactionLifeCycleTracker, but the problem
is that it does not notify the user if the compaction cannot be scheduled.

Also, since we decided not to expose StoreScanner to CP users, it does not
make sense to allow CP users to return an InternalScanner before we actually
create the StoreScanner in our own code. In the example in HBASE-18747, I wrap
the InternalScanner and then do the filtering in the preCompact method. I think
this is the correct way to do filtering on compaction and flush.
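
The wrapping pattern can be sketched roughly as follows. This is not the code
from HBASE-18747 and it does not use the real HBase classes: RowScanner is a
hypothetical stand-in for InternalScanner, cells are plain strings, and
fromRows and drain are illustration-only helpers. It only assumes that the
wrapped scanner fills a list with one row of cells per call and returns whether
more rows follow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

final class FilteringScannerSketch {

    /** Hypothetical stand-in for InternalScanner: one call yields one row. */
    interface RowScanner {
        /** Appends the next row's cells to result; returns true if more rows follow. */
        boolean next(List<String> result);
    }

    /** Wraps the scanner we are handed and drops cells the predicate rejects. */
    static final class FilteringScanner implements RowScanner {
        private final RowScanner delegate;
        private final Predicate<String> keep;

        FilteringScanner(RowScanner delegate, Predicate<String> keep) {
            this.delegate = delegate;
            this.keep = keep;
        }

        @Override
        public boolean next(List<String> result) {
            List<String> row = new ArrayList<>();
            boolean more = delegate.next(row);
            // The wrapper can only remove cells; it cannot add data that the
            // delegate never returned.
            for (String cell : row) {
                if (keep.test(cell)) {
                    result.add(cell);
                }
            }
            return more;
        }
    }

    /** Illustration-only scanner over rows held in memory. */
    static RowScanner fromRows(List<List<String>> rows) {
        Iterator<List<String>> it = rows.iterator();
        return result -> {
            if (it.hasNext()) {
                result.addAll(it.next());
            }
            return it.hasNext();
        };
    }

    /** Reads the scanner to the end and collects every cell it emits. */
    static List<String> drain(RowScanner scanner) {
        List<String> out = new ArrayList<>();
        boolean more;
        do {
            more = scanner.next(out);
        } while (more);
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("r1:a", "r1:tmp"),
            Arrays.asList("r2:tmp"),
            Arrays.asList("r3:b"));
        RowScanner filtered =
            new FilteringScanner(fromRows(rows), cell -> !cell.endsWith(":tmp"));
        System.out.println(drain(filtered));
    }
}
```

In a real coprocessor the wrapper would be returned from preCompact around the
scanner passed in, so the StoreScanner itself is still created by HBase's own
code.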

The limitation of this solution is that we can only remove data during
compaction or flush. In the old example, we could reset the TTL in ScanInfo to
include more data. But I think this is acceptable, as you can use a longer
TTL (such as forever) to include the data, set KEEP_DELETED_CELLS to true, and
increase the versions so that compaction and flush give you the data you want,
and then do the filtering.

Another problem may be performance. When using the original filter or other
mechanisms such as TTL, the StoreScanner may seek rather than skip when you want
to jump to the next row or column, but for now you can only skip. I think this
is OK for most cases, as a row is usually not very large, and compaction is not
on the critical path of normal operation.
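
To make the seek versus skip difference concrete, here is a toy model. It is
not how StoreScanner is implemented: cells are modelled as sorted strings of
the hypothetical form "row/column", row names are assumed to be alphanumeric,
and skipToNextRow and seekToNextRow are names invented for this sketch. Skip
touches every cell of the row, while seek jumps straight to the next row with a
binary search.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SeekVersusSkipSketch {

    /**
     * Skip: step over the cells of the given row one by one.
     * Returns the index of the first cell after the row and adds the number
     * of cells stepped over to visited[0].
     */
    static int skipToNextRow(List<String> sortedKeys, int from, String row, int[] visited) {
        String prefix = row + "/";
        int i = from;
        while (i < sortedKeys.size() && sortedKeys.get(i).startsWith(prefix)) {
            visited[0]++;
            i++;
        }
        return i;
    }

    /**
     * Seek: binary search for the first key that sorts after every key of
     * the given row. '0' is the character right after '/' in ASCII, so
     * row + "0" sorts after all "row/..." keys.
     */
    static int seekToNextRow(List<String> sortedKeys, String row) {
        int pos = Collections.binarySearch(sortedKeys, row + "0");
        return pos >= 0 ? pos : -pos - 1;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            keys.add(String.format("r1/c%04d", i));
        }
        keys.add("r2/c0000");
        Collections.sort(keys);

        int[] visited = {0};
        int bySkip = skipToNextRow(keys, 0, "r1", visited);
        int bySeek = seekToNextRow(keys, "r1");
        System.out.println("skip landed at " + bySkip + " after visiting " + visited[0] + " cells");
        System.out.println("seek landed at " + bySeek);
    }
}
```

Both approaches land on the same cell; the cost of skipping only grows with the
width of the row, which is why it matters little when rows are small.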

Thanks.

> Polish the compaction related CP hooks
> --------------------------------------
>
>                 Key: HBASE-18989
>                 URL: https://issues.apache.org/jira/browse/HBASE-18989
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction, Coprocessors
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0-alpha-4
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
