[
https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934313#action_12934313
]
Alex Baranau commented on HBASE-2038:
-------------------------------------
The following might be helpful. Some time ago I refactored IHBase a bit and
extracted base interfaces to make it easier to develop a custom indexing
implementation: IndexManager and IndexScannerContext (I just published the code
here: https://github.com/abaranau/ihbase). To give an idea, here's how they
look:
{noformat}
public abstract class IndexManager implements HeapSize {
  public abstract void initialize(IdxRegion region);
  public abstract Pair<Long, Callable<Void>> rebuildIndexes() throws IOException;
  public abstract void cleanup();
  public abstract IndexScannerContext newIndexScannerContext(Scan scan) throws IOException;
}

public interface IndexScannerContext {
  KeyValue getNextKey();
  void close();
}
{noformat}
This refactored code roughly corresponds to the BaseRegionObserverCoprocessor
API (at least in my head):
* IndexManager's initialize() should be invoked at region open time,
* rebuildIndexes() during flush,
* cleanup() during region close,
* an IndexScannerContext should be created when a scan is opened,
* its getNextKey() (with a bit of additional code) should be called during the
scan's next(),
* and finally, close() when the scan is closed in the region.
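The mapping above can be sketched roughly as follows. This is only an illustration under stated assumptions: every type in it (Manager, ScannerContext, IndexingObserver, RecordingManager) is a simplified, hypothetical stand-in written for this example, not the HBase coprocessor API and not the IHBase classes. Keys are plain Strings instead of KeyValue, and the hook names (onRegionOpen, onFlush, and so on) are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch only: every type here is a simplified stand-in, not the HBase or IHBase API.
final class IndexLifecycleSketch {

    /** Stand-in for IndexScannerContext; keys are Strings instead of KeyValue. */
    interface ScannerContext {
        String getNextKey(); // returns null when no indexed keys remain
        void close();
    }

    /** Stand-in for IndexManager, trimmed to the lifecycle methods. */
    interface Manager {
        void initialize();
        void rebuildIndexes();
        void cleanup();
        ScannerContext newIndexScannerContext();
    }

    /** Hypothetical observer: each region hook delegates to the matching index call. */
    static final class IndexingObserver {
        private final Manager manager;
        IndexingObserver(Manager manager) { this.manager = manager; }

        void onRegionOpen() { manager.initialize(); }
        void onFlush() { manager.rebuildIndexes(); }
        void onRegionClose() { manager.cleanup(); }
        ScannerContext onScannerOpen() { return manager.newIndexScannerContext(); }
        String onScannerNext(ScannerContext ctx) { return ctx.getNextKey(); }
        void onScannerClose(ScannerContext ctx) { ctx.close(); }
    }

    /** Toy manager that records each call so the ordering can be inspected. */
    static final class RecordingManager implements Manager {
        final List<String> calls = new ArrayList<>();
        private final List<String> keys;
        RecordingManager(List<String> keys) { this.keys = keys; }

        public void initialize() { calls.add("initialize"); }
        public void rebuildIndexes() { calls.add("rebuildIndexes"); }
        public void cleanup() { calls.add("cleanup"); }
        public ScannerContext newIndexScannerContext() {
            calls.add("newIndexScannerContext");
            final Iterator<String> it = keys.iterator();
            return new ScannerContext() {
                public String getNextKey() {
                    calls.add("getNextKey");
                    return it.hasNext() ? it.next() : null;
                }
                public void close() { calls.add("close"); }
            };
        }
    }

    /** Drives one region lifecycle with a single scan and returns the call log. */
    static List<String> runDemo() {
        RecordingManager manager = new RecordingManager(Arrays.asList("row1", "row2"));
        IndexingObserver observer = new IndexingObserver(manager);
        observer.onRegionOpen();
        observer.onFlush();
        ScannerContext ctx = observer.onScannerOpen();
        while (observer.onScannerNext(ctx) != null) {
            // a real scanner would seek to the returned key here
        }
        observer.onScannerClose(ctx);
        observer.onRegionClose();
        return manager.calls;
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

The observer holds no indexing logic of its own and only forwards lifecycle events, which is the point of the mapping: the index implementation stays behind the two interfaces and could be swapped without touching the hooks.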
I'm not saying we should use the refactored IHBase code as is; I'm sharing it
only for "visualization purposes", as a way to express the idea.
Please, correct my logic where needed!
Thanks!
P.S. Please don't judge the class names or other rough pieces in the
refactored version I've shared. I wanted to a) just *try* to add an additional
abstraction so that a custom indexing implementation can be injected, and b)
make as few changes to the IHBase codebase as I can so that others can follow
them easily. Copy-pasting the notes I took during refactoring (they may be
helpful if someone wants to dig into the code):
"1. Extracted interface IndexManager from IdxRegionIndexManager.
2. Extracted separate IdxRegionIndexManagerMBean from IdxRegionMbean with
IndexManager-implementation-specific info
3. Created IndexScannerContext interface (with IdxScannerContext
implementation, which encapsulates idxSearchContext and
matchedExpressionIterator for existing code) which performs iteration over
indexed expression keys.
NOTE: Didn't think about renaming class IdxRegionIndexManager and related.
NOTE: Didn't think about "repackaging" things.
NOTE: New code/classes lack javadocs, will add them
NOTE: Unit-tests should be added with regard to refactoring (add check
IdxRegionIndexManagerMBean values at least)"
> Coprocessors: Region level indexing
> -----------------------------------
>
> Key: HBASE-2038
> URL: https://issues.apache.org/jira/browse/HBASE-2038
> Project: HBase
> Issue Type: Sub-task
> Reporter: Andrew Purtell
> Priority: Minor
>
> HBASE-2037 is a good candidate to be done as a coprocessor. It also serves as a
> good goalpost for coprocessor environment design -- there should be enough of
> it so region level indexing can be reimplemented as a coprocessor without any
> loss of functionality.