[
https://issues.apache.org/jira/browse/HBASE-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465471#comment-13465471
]
Andrew Purtell edited comment on HBASE-6805 at 9/28/12 7:13 PM:
----------------------------------------------------------------
[~chenghao] Thank you for the patch.
However, the attached patch is missing unit tests and any example of
filter+scan wrapping. We can evaluate the API changes in isolation, but without
a good understanding of your motivation, it is not clear why the changes are
necessary. Why not use custom filters?
In HBASE-6577 see [~lhofhansl]'s comment
https://issues.apache.org/jira/browse/HBASE-6577?focusedCommentId=13433898&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13433898.
If we extend the CP hook model to internal filter methods, we must be deeply
concerned about the cost of iterating CP hook lists during filtering/scanning.
First of all, CPs lengthen the code path. Then, if hooks are registered, there
will be method invocation and object allocation costs for _every_ filter
operation, twice (once for the pre hook and once for the post hook).
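To make the concern concrete, here is a minimal Java sketch of that dispatch
pattern. It is not taken from the patch: KeyValueFilter, FilterObserver,
CountingObserver, and scan are hypothetical stand-ins rather than HBase APIs,
and a String stands in for a KeyValue. It assumes hooks would be dispatched by
walking a list of registered observers before and after each filter call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of pre/post hook dispatch around a filter call; not HBase code. */
public class FilterHookSketch {

    /** Stand-in for a filter's per-KeyValue decision (hypothetical). */
    interface KeyValueFilter {
        boolean include(String keyValue);
    }

    /** Hypothetical observer using the hook names proposed in the issue. */
    interface FilterObserver {
        void preFilterKeyValue(String keyValue);
        void postFilterKeyValue(String keyValue, boolean included);
    }

    /** Observer that does no work except count how often it is invoked. */
    static final class CountingObserver implements FilterObserver {
        long calls;

        @Override
        public void preFilterKeyValue(String keyValue) {
            calls++;
        }

        @Override
        public void postFilterKeyValue(String keyValue, boolean included) {
            calls++;
        }
    }

    /** Filters every KeyValue, walking the observer list once before and once after each call. */
    static int scan(List<String> keyValues, KeyValueFilter filter,
                    List<? extends FilterObserver> observers) {
        int returned = 0;
        for (String kv : keyValues) {
            // First walk of the hook list; each for-each loop may allocate an Iterator.
            for (FilterObserver o : observers) {
                o.preFilterKeyValue(kv);
            }
            boolean included = filter.include(kv);
            // Second walk of the hook list for the same filter operation.
            for (FilterObserver o : observers) {
                o.postFilterKeyValue(kv, included);
            }
            if (included) {
                returned++;
            }
        }
        return returned;
    }

    public static void main(String[] args) {
        List<String> kvs = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            kvs.add("row" + i);
        }
        KeyValueFilter endsWithZero = kv -> kv.endsWith("0");

        // Compare zero, one, and two registered observers.
        for (int n = 0; n <= 2; n++) {
            List<CountingObserver> observers = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                observers.add(new CountingObserver());
            }
            int returned = scan(kvs, endsWithZero, observers);
            long hookCalls = 0;
            for (CountingObserver o : observers) {
                hookCalls += o.calls;
            }
            System.out.println(n + " observer(s): returned=" + returned
                    + ", hookCalls=" + hookCalls);
        }
    }
}
```

With N observers registered, each KeyValue costs 2 x N hook invocations in this
sketch on top of the filter call itself. How much that matters in practice is
something only a benchmark on a real scan can show.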
Have you tried benchmarking filter performance with and without the proposed
changes? In a realistic scan example, what is the performance difference
between stock HBase and HBase with this patch applied? With one "filter
coprocessor" installed? With two?
Edit: Improved comment clarity.
was (Author: apurtell):
[~chenghao] Thank you for the patch.
However, the attached patch is missing unit tests and any example of
filter+scan wrapping. We can evaluate the API changes in isolation, but without
a good understanding of your motivation, why the changes are necessary is not
clear. Why not use custom filters?
In HBASE-6577 see [~lhofhansl]'s comment
https://issues.apache.org/jira/browse/HBASE-6577?focusedCommentId=13433898&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13433898.
It makes sense to use the CP framework to inject custom filter wrappers, or
similar, but if also extending the CP hook model to internal filter methods, I
worry about all of the iteration of CP hook lists during filtering/scanning.
There will be method invocation and object allocation costs there for _every_
filter operation, twice. Have you tried benchmarking filter performance with
and without the proposed changes?
> Extend co-processor framework to provide observers for filter operations
> ------------------------------------------------------------------------
>
> Key: HBASE-6805
> URL: https://issues.apache.org/jira/browse/HBASE-6805
> Project: HBase
> Issue Type: Sub-task
> Components: Coprocessors
> Affects Versions: 0.96.0
> Reporter: Jason Dai
> Attachments: extend_coprocessor.patch
>
>
> There are several filter operations (e.g., filterKeyValue, filterRow,
> transform, etc.) at the region server side that either exclude KVs from the
> returned results, or transform the returned KV. We need to provide observers
> (e.g., preFilterKeyValue and postFilterKeyValue) for these operations in the
> same way as the observers for other data access operations (e.g., preGet and
> postGet). This extension is needed to support DOT (e.g., extracting
> individual fields from the document in the observers before passing them to
> the related filter operations).