[ 
https://issues.apache.org/jira/browse/HBASE-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468086#comment-13468086
 ] 

Gary Helmling commented on HBASE-6805:
--------------------------------------

Looking at the encryption example here, it seems like you could provide that 
with the existing coprocessor hooks:

Option A:
# In {{EncryptingRegionObserver.preScannerOpen()}}, if any Filter is set on the 
Scan object, wrap it in a custom {{DecryptingFilterWrapper}}.  This would just 
decrypt the KVs before passing them on to the client-provided Filter, 
essentially doing the same work your example preFilterXXX methods are doing.
# In {{EncryptingRegionObserver.postScannerNext()}}, again decrypt the final 
KVs being returned to the client, same as your example.

The duplicate decryption here seems unnecessary, but it should give you the 
same results as your provided example, without the need to add a batch of 
pre/postFilterXXX hooks to RegionObservers.
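
To make Option A concrete, here is a rough, self-contained sketch of the wrapping idea. It deliberately does not use the real HBase classes: {{Cell}}, {{CellFilter}}, {{scan()}} and the XOR "cipher" are hypothetical stand-ins I made up for illustration, and the comments mark which coprocessor hook each step is assumed to correspond to.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DecryptingFilterSketch {

    /** Hypothetical stand-in for a KV: a row key plus an (encrypted) value. */
    public static final class Cell {
        final String row;
        final byte[] value;
        public Cell(String row, byte[] value) { this.row = row; this.value = value; }
    }

    /** Hypothetical, simplified filter contract: true means "keep this cell". */
    public interface CellFilter {
        boolean include(Cell cell);
    }

    /** Toy XOR transform standing in for real encryption; applying it twice restores the input. */
    public static byte[] xor(byte[] data, byte key) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key);
        }
        return out;
    }

    /** Step 1: shows the client filter a decrypted copy of each cell. */
    public static final class DecryptingFilterWrapper implements CellFilter {
        private final CellFilter delegate;
        private final byte key;

        public DecryptingFilterWrapper(CellFilter delegate, byte key) {
            this.delegate = delegate;
            this.key = key;
        }

        @Override
        public boolean include(Cell cell) {
            return delegate.include(new Cell(cell.row, xor(cell.value, key)));
        }
    }

    /** Simulated scan over encrypted cells, returning decrypted values. */
    public static List<String> scan(List<Cell> stored, CellFilter clientFilter, byte key) {
        // Roughly what preScannerOpen() would do: wrap the client's filter.
        CellFilter wrapped = new DecryptingFilterWrapper(clientFilter, key);
        List<String> results = new ArrayList<>();
        for (Cell cell : stored) {
            if (wrapped.include(cell)) {
                // Roughly what postScannerNext() would do: decrypt again for the client.
                results.add(new String(xor(cell.value, key), StandardCharsets.UTF_8));
            }
        }
        return results;
    }

    /** Stores three encrypted values and scans with a filter written against plaintext. */
    public static List<String> demo() {
        byte key = 0x5A;
        List<Cell> stored = new ArrayList<>();
        for (String v : Arrays.asList("apple", "banana", "avocado")) {
            stored.add(new Cell("row-" + v, xor(v.getBytes(StandardCharsets.UTF_8), key)));
        }
        CellFilter startsWithA =
            c -> new String(c.value, StandardCharsets.UTF_8).startsWith("a");
        return scan(stored, startsWithA, key);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [apple, avocado]
    }
}
```

Note that every cell that passes the filter is decrypted twice here (once inside the wrapper, once on the way out), which is the duplicate work mentioned above.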

Option B:
# In {{EncryptingRegionObserver.preStoreScannerOpen()}} return a custom 
KeyValueScanner implementation that extends or wraps the default StoreScanner 
implementation.  Note that this would still be a little tricky since filters 
are applied down in ScanQueryMatcher.  For decryption what you would really 
want is to hook in above the StoreFileScanners and MemStoreScanners used 
internally by StoreScanner, but below the ScanQueryMatcher operations, so that 
you can decrypt each KV once as it's read.  Seems like that would currently 
require duplicating a fair amount of StoreScanner functionality.  Maybe 
something needs to be added to better hook in to this data reading layer?
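
As a rough illustration of the layering I have in mind for Option B, the sketch below decrypts each cell exactly once as it is read from the underlying source, and only then applies the filter. Again this is not the HBase API: {{Cell}}, {{DecryptingScanner}}, {{scan()}} and the XOR "cipher" are hypothetical placeholders, with a plain {{Iterator}} standing in for the StoreFileScanners/MemStoreScanners and a {{Predicate}} standing in for the ScanQueryMatcher/filter layer.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

public class DecryptOnceScannerSketch {

    /** Hypothetical stand-in for a KV: a row key plus a value. */
    public static final class Cell {
        final String row;
        final byte[] value;
        public Cell(String row, byte[] value) { this.row = row; this.value = value; }
    }

    /** Toy XOR transform standing in for real encryption; applying it twice restores the input. */
    public static byte[] xor(byte[] data, byte key) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key);
        }
        return out;
    }

    /** Sits directly above the raw data source and decrypts each cell once as it is read. */
    public static final class DecryptingScanner implements Iterator<Cell> {
        private final Iterator<Cell> raw;
        private final byte key;
        private int decrypted = 0;

        public DecryptingScanner(Iterator<Cell> raw, byte key) {
            this.raw = raw;
            this.key = key;
        }

        @Override
        public boolean hasNext() { return raw.hasNext(); }

        @Override
        public Cell next() {
            Cell cell = raw.next();
            decrypted++;
            return new Cell(cell.row, xor(cell.value, key));
        }

        public int decryptedCount() { return decrypted; }
    }

    /** Stand-in for the matcher/filter layer: it only ever sees already-decrypted cells. */
    public static List<String> scan(Iterator<Cell> scanner, Predicate<String> filter) {
        List<String> results = new ArrayList<>();
        while (scanner.hasNext()) {
            String value = new String(scanner.next().value, StandardCharsets.UTF_8);
            if (filter.test(value)) {
                results.add(value);
            }
        }
        return results;
    }

    /** Returns the scan results followed by the number of decrypt calls. */
    public static String demo() {
        byte key = 0x5A;
        List<Cell> stored = new ArrayList<>();
        for (String v : Arrays.asList("apple", "banana", "avocado")) {
            stored.add(new Cell("row-" + v, xor(v.getBytes(StandardCharsets.UTF_8), key)));
        }
        DecryptingScanner scanner = new DecryptingScanner(stored.iterator(), key);
        List<String> results = scan(scanner, v -> v.startsWith("a"));
        return results + ":" + scanner.decryptedCount();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [apple, avocado]:3
    }
}
```

Three cells are read and three decryptions happen, regardless of how many pass the filter. The hard part in real HBase is getting a wrapper into that position without reimplementing StoreScanner.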

The main issue I see is that the added hooks blur the line between Filters and 
RegionObservers and their areas of responsibility.  It doesn't seem like we 
should really need pre/postFilterXXX hooks, because that's what filters are 
supposed to provide.  And of course adding more Observer hooks does have a cost 
in increasing complexity of the coprocessor interfaces and added overhead 
(especially in hot code paths).

Are there really cases that require the pre/postFilter hooks that can't be 
accomplished by having a RegionObserver wrap gets/scans with its own Filter 
implementation that coordinates with the RegionObserver instance?
                
> Extend co-processor framework to provide observers for filter operations
> ------------------------------------------------------------------------
>
>                 Key: HBASE-6805
>                 URL: https://issues.apache.org/jira/browse/HBASE-6805
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Coprocessors
>    Affects Versions: 0.96.0
>            Reporter: Jason Dai
>         Attachments: extend_coprocessor.patch
>
>
> There are several filter operations (e.g., filterKeyValue, filterRow, 
> transform, etc.) at the region server side that either exclude KVs from the 
> returned results, or transform the returned KV. We need to provide observers 
> (e.g., preFilterKeyValue and postFilterKeyValue) for these operations in the 
> same way as the observers for other data access operations (e.g., preGet and 
> postGet). This extension is needed to support DOT (e.g., extracting 
> individual fields from the document in the observers before passing them to 
> the related filter operations) 
