[jira] [Issue Comment Edited] (HBASE-3842) Refactor Coprocessor Compaction API

Gary Helmling (JIRA) Mon, 01 Aug 2011 17:58:51 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13075985#comment-13075985
 ]


Gary Helmling edited comment on HBASE-3842 at 8/2/11 12:57 AM:
---------------------------------------------------------------

I think the stacking issue is key here:  are we expecting the common case to be 
loading a single "CompactionObserver" that overrides the compaction 
implementation, or loading multiple observers that each override/customize 
compaction policy but not necessarily behavior?

I agree on the one hand that having a {{KeyValue}} oriented interface for 
{{preCompactWrite()}} and {{postCompactWrite()}} may not be sufficient.  At the 
same time, I don't think we want to force the implementations to write their 
own {{StoreFiles}}, as that will be massively inefficient -- for N loaded 
coprocessors this becomes N compactions being written (assuming we bypass the 
core compaction code at the end of chaining).

One alternative would be to have {{preCompact}} take the scanner to be used as 
a parameter, as suggested, and return a scanner instance that would allow 
overriding policy and mutating KVs, while still relying on the core writer 
functionality.  This would allow wrapping the store scanner with a custom 
scanner that inspects and emits KVs as needed on the fly.  In this case, 
{{preCompact}} would look like:

{code}
StoreScanner preCompact(ObserverContext<~> context, Store store,
    StoreScanner scanner);
{code}
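
To make the wrapping idea concrete, here is a minimal sketch in plain Java.  
It does not use the HBase classes: {{KvScanner}}, {{ListScanner}} and 
{{PrefixDroppingScanner}} are hypothetical stand-ins, KVs are modeled as 
plain strings, and {{drain()}} plays the role of the core writer.  The only 
point is that the wrapper decides what is emitted while the write still 
happens once, in core code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Demo driver; drain() stands in for the core compaction writer. */
final class ScannerWrapDemo {
    /** Reads the scanner to exhaustion and collects what would be written. */
    static List<String> drain(KvScanner scanner) {
        List<String> out = new ArrayList<>();
        for (String kv = scanner.next(); kv != null; kv = scanner.next()) {
            out.add(kv);
        }
        return out;
    }

    public static void main(String[] args) {
        KvScanner core = new ListScanner(Arrays.asList("row1/a", "tmp/b", "row2/c"));
        // The observer hands back a wrapper instead of writing files itself.
        KvScanner wrapped = new PrefixDroppingScanner(core, "tmp/");
        System.out.println(drain(wrapped));
    }
}

/** Hypothetical stand-in for a store scanner (assumption, not the HBase type). */
interface KvScanner {
    /** Returns the next entry, or null when the scanner is exhausted. */
    String next();
}

/** In-memory scanner standing in for the scanner that core compaction builds. */
final class ListScanner implements KvScanner {
    private final Iterator<String> it;

    ListScanner(List<String> kvs) {
        this.it = kvs.iterator();
    }

    @Override
    public String next() {
        return it.hasNext() ? it.next() : null;
    }
}

/** Wrapper that skips entries with a given prefix and passes the rest through. */
final class PrefixDroppingScanner implements KvScanner {
    private final KvScanner delegate;
    private final String prefix;

    PrefixDroppingScanner(KvScanner delegate, String prefix) {
        this.delegate = delegate;
        this.prefix = prefix;
    }

    @Override
    public String next() {
        String kv;
        while ((kv = delegate.next()) != null) {
            if (!kv.startsWith(prefix)) {
                return kv;
            }
        }
        return null;
    }
}
```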

Wrapping the scanner seems much easier for chaining multiple observers.  On the 
other hand, we lose the clean {{boolean}} return to indicate that core 
compaction processing should be skipped.  Are there cases that would still want 
to handle the store file writing portion of the implementation entirely in 
the coprocessor?  If so, can we still emit a flag to skip normal processing 
another way?  We could skip normal processing if {{null}} is returned.  Seems a 
little clunky, but it could work with appropriate documentation.
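
A rough sketch of how chaining with a {{null}} "skip" signal might behave, 
again in plain Java rather than against the real coprocessor host.  All 
names here ({{CellSource}}, {{CompactionHook}}, {{CompactionChain}} and its 
helper hooks) are hypothetical, and stopping at the first {{null}} is an 
assumption about how the host would treat it, not settled behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical host-side chaining; not the HBase coprocessor host. */
final class CompactionChain {
    /** Passes the scanner through each hook in order; the first null ends the chain. */
    static CellSource applyHooks(List<CompactionHook> hooks, CellSource scanner) {
        CellSource current = scanner;
        for (CompactionHook hook : hooks) {
            current = hook.preCompact(current);
            if (current == null) {
                return null; // a hook asked to skip core processing
            }
        }
        return current;
    }

    /** Performs the single core write, or returns null if a hook opted out. */
    static List<String> compact(List<CompactionHook> hooks, List<String> input) {
        Iterator<String> it = input.iterator();
        CellSource core = () -> it.hasNext() ? it.next() : null;
        CellSource last = applyHooks(hooks, core);
        if (last == null) {
            return null;
        }
        List<String> written = new ArrayList<>();
        for (String kv = last.next(); kv != null; kv = last.next()) {
            written.add(kv);
        }
        return written;
    }

    /** Hook that filters out entries starting with the given prefix. */
    static CompactionHook dropPrefix(String prefix) {
        return s -> () -> {
            String kv;
            while ((kv = s.next()) != null) {
                if (!kv.startsWith(prefix)) {
                    return kv;
                }
            }
            return null;
        };
    }

    /** Hook that mutates each entry on the fly. */
    static CompactionHook upperCase() {
        return s -> () -> {
            String kv = s.next();
            return kv == null ? null : kv.toUpperCase();
        };
    }

    /** Hook that claims the whole write for itself by returning null. */
    static CompactionHook takeOver() {
        return s -> null;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("row1/a", "tmp/b", "row2/c");
        System.out.println(compact(Arrays.asList(dropPrefix("tmp/"), upperCase()), input));
        System.out.println(compact(Arrays.asList(upperCase(), takeOver()), input));
    }
}

/** Stand-in for a store scanner; next() returns null when exhausted. */
interface CellSource {
    String next();
}

/** Stand-in for an observer's preCompact hook. */
interface CompactionHook {
    /** Returns the scanner to use next, or null to request skipping core compaction. */
    CellSource preCompact(CellSource scanner);
}
```

With two wrapping hooks the core write still runs exactly once over the final 
scanner.  The cost of the {{null}} convention is also visible: once one hook 
returns {{null}}, any hooks after it are never consulted.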

> Refactor Coprocessor Compaction API
> -----------------------------------
>
>                 Key: HBASE-3842
>                 URL: https://issues.apache.org/jira/browse/HBASE-3842
>             Project: HBase
>          Issue Type: Improvement
>          Components: coprocessors, regionserver
>    Affects Versions: 0.92.0
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>              Labels: compaction
>             Fix For: 0.92.0
>
>
> After HBASE-3797, the compaction logic flow has been significantly altered.  
> Because of this, the current compaction coprocessor API is insufficient for 
> gaining full insight into compaction requests/results.  Refactor coprocessor 
> API after HBASE-3797 is committed to be more extensible and increase 
> visibility.

