[ 
https://issues.apache.org/jira/browse/HBASE-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465471#comment-13465471
 ] 

Andrew Purtell edited comment on HBASE-6805 at 9/28/12 7:13 PM:
----------------------------------------------------------------

[~chenghao] Thank you for the patch. 

However, the attached patch is missing unit tests and any example of 
filter+scan wrapping. We can evaluate the API changes in isolation, but without 
a good understanding of your motivation, it is not clear why the changes are 
necessary. Why not use custom filters? 

In HBASE-6577 see [~lhofhansl]'s comment
https://issues.apache.org/jira/browse/HBASE-6577?focusedCommentId=13433898&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13433898.

If the CP hook model is extended to internal filter methods, we must be deeply 
concerned about the cost of iterating CP hook lists during filtering/scanning. 
First of all, CPs lengthen the code path. Then, if hooks are registered, there 
will be method invocation and object allocation costs for _every_ filter 
operation, twice (once for the pre hook and once for the post hook).
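
To make the per-operation cost concrete, here is a rough, self-contained Java 
sketch. It is not HBase code and does not use the patch's API: FilterObserver, 
CountingObserver, and the toy filterKeyValue below are hypothetical stand-ins, 
assumed only for illustration. It counts how many hook invocations a scan 
incurs when every filter call walks an observer list before and after the 
filter itself.

```java
import java.util.ArrayList;
import java.util.List;

public class HookCostSketch {
    /** Hypothetical stand-in for a filter observer; not an HBase interface. */
    interface FilterObserver {
        void preFilterKeyValue(byte[] kv);
        void postFilterKeyValue(byte[] kv, boolean included);
    }

    /** Observer that only counts how often it is invoked. */
    static final class CountingObserver implements FilterObserver {
        long calls;
        @Override public void preFilterKeyValue(byte[] kv) { calls++; }
        @Override public void postFilterKeyValue(byte[] kv, boolean included) { calls++; }
    }

    /** Toy filter (assumption): includes KVs whose first byte is even. */
    static boolean filterKeyValue(byte[] kv) {
        return (kv[0] & 1) == 0;
    }

    /** Filters one KV, walking the observer list before and after the filter. */
    static boolean filterWithHooks(List<? extends FilterObserver> observers, byte[] kv) {
        for (FilterObserver o : observers) {
            o.preFilterKeyValue(kv);
        }
        boolean included = filterKeyValue(kv);
        for (FilterObserver o : observers) {
            o.postFilterKeyValue(kv, included);
        }
        return included;
    }

    /** Total hook invocations for a scan over kvCount KVs with observerCount observers. */
    public static long countHookCalls(int kvCount, int observerCount) {
        List<CountingObserver> observers = new ArrayList<>();
        for (int i = 0; i < observerCount; i++) {
            observers.add(new CountingObserver());
        }
        for (int i = 0; i < kvCount; i++) {
            filterWithHooks(observers, new byte[] { (byte) i });
        }
        long total = 0;
        for (CountingObserver o : observers) {
            total += o.calls;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 2; n++) {
            System.out.println(n + " observer(s): "
                + countHookCalls(1_000_000, n) + " hook calls");
        }
    }
}
```

In this sketch the number of extra calls is 2 x kvCount x observerCount, so it 
grows with both the size of the scan and the number of installed observers. 
The sketch only counts invocations; it says nothing about wall-clock cost, 
which would have to be measured against real HBase.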

Have you tried benchmarking filter performance with and without the proposed 
changes? In a realistic scan example, what is the performance difference 
between stock HBase and an HBase with this patch applied? With one "filter 
coprocessor" installed? With two?

Edit: Improved comment clarity.
                
      was (Author: apurtell):
    [~chenghao] Thank you for the patch. 

However, the attached patch is missing unit tests and any example of 
filter+scan wrapping. We can evaluate the API changes in isolation, but without 
a good understanding of your motivation, why the changes are necessary is not 
clear. Why not use custom filters? 

In HBASE-6577 see [~lhofhansl]'s comment
https://issues.apache.org/jira/browse/HBASE-6577?focusedCommentId=13433898&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13433898.

It makes sense to use the CP framework to inject custom filter wrappers, or 
similar, but if also extending the CP hook model to internal filter methods, I 
worry about all of the iteration of CP hook lists during filtering/scanning. 
There will be method invocation and object allocation costs there for _every_ 
filter operation, twice. Have you tried benchmarking filter performance with 
and without the proposed changes?

                  
> Extend co-processor framework to provide observers for filter operations
> ------------------------------------------------------------------------
>
>                 Key: HBASE-6805
>                 URL: https://issues.apache.org/jira/browse/HBASE-6805
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Coprocessors
>    Affects Versions: 0.96.0
>            Reporter: Jason Dai
>         Attachments: extend_coprocessor.patch
>
>
> There are several filter operations (e.g., filterKeyValue, filterRow, 
> transform, etc.) at the region server side that either exclude KVs from the 
> returned results, or transform the returned KV. We need to provide observers 
> (e.g., preFilterKeyValue and postFilterKeyValue) for these operations in the 
> same way as the observers for other data access operations (e.g., preGet and 
> postGet). This extension is needed to support DOT (e.g., extracting 
> individual fields from the document in the observers before passing them to 
> the related filter operations) 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
