[
https://issues.apache.org/jira/browse/HBASE-19001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206869#comment-16206869
]
Duo Zhang commented on HBASE-19001:
-----------------------------------
OK, the problem for Tephra is with flush and compaction. It does two things:
first, it sets the scan to read all versions; second, it adds a Filter.
I think the first one is not a problem for flush/compaction, since we always
read all versions during flush/compaction. The flush/compaction for MOB may be
different, but I think that is OK? The MOB file works like external storage.
For the filter, the code is:
{code}
  static class IncludeInProgressFilter extends FilterBase {
    private final long visibilityUpperBound;
    private final Set<Long> invalidIds;
    private final Filter txFilter;

    public IncludeInProgressFilter(long upperBound, Collection<Long> invalids,
        Filter transactionFilter) {
      this.visibilityUpperBound = upperBound;
      this.invalidIds = Sets.newHashSet(invalids);
      this.txFilter = transactionFilter;
    }

    @Override
    public ReturnCode filterKeyValue(Cell cell) throws IOException {
      // include all cells visible to in-progress transactions, except for
      // those already marked as invalid
      long ts = cell.getTimestamp();
      if (ts > visibilityUpperBound) {
        // include everything that could still be in-progress except invalids
        if (invalidIds.contains(ts)) {
          return ReturnCode.SKIP;
        }
        return ReturnCode.INCLUDE;
      }
      return txFilter.filterKeyValue(cell);
    }
  }
{code}
It only implements filterKeyValue, so I think it is easy to change it to wrap
the InternalScanner and do the filtering on the Cell list returned by
InternalScanner.next. There is an example:
https://github.com/apache/hbase/blob/master/hbase-examples/src/main/java/org/apache/hadoop/hbase/coprocessor/example/ZooKeeperScanPolicyObserver.java
{code}
  private InternalScanner wrap(InternalScanner scanner) {
    OptionalLong optExpireBefore = getExpireBefore();
    if (!optExpireBefore.isPresent()) {
      return scanner;
    }
    long expireBefore = optExpireBefore.getAsLong();
    return new DelegatingInternalScanner(scanner) {
      @Override
      public boolean next(List<Cell> result, ScannerContext scannerContext)
          throws IOException {
        boolean moreRows = scanner.next(result, scannerContext);
        result.removeIf(c -> c.getTimestamp() < expireBefore);
        return moreRows;
      }
    };
  }
{code}
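To make the suggestion concrete, here is a rough sketch of how the decision made in IncludeInProgressFilter could be applied to the cell list returned by a wrapped scanner. It is not Tephra or HBase code: SimpleCell and SimpleScanner are hypothetical stand-ins for Cell and InternalScanner so that the snippet compiles without HBase on the classpath (the stand-in next() takes no ScannerContext and throws no IOException), and the wrapped transaction filter is reduced to a boolean predicate on the timestamp. That last point is an assumption, since a real Filter may return codes other than INCLUDE/SKIP that a post-filter on the list cannot reproduce.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.LongPredicate;

/** Sketch only: the nested types are hypothetical stand-ins, not HBase classes. */
final class InProgressScannerSketch {

  /** Stand-in for HBase's Cell; only the timestamp matters for this filter. */
  static final class SimpleCell {
    final String row;
    final long timestamp;

    SimpleCell(String row, long timestamp) {
      this.row = row;
      this.timestamp = timestamp;
    }

    long getTimestamp() {
      return timestamp;
    }
  }

  /** Stand-in for InternalScanner: appends one batch to result, returns true if more remain. */
  interface SimpleScanner {
    boolean next(List<SimpleCell> result);
  }

  /**
   * Wraps a scanner and drops cells the way IncludeInProgressFilter would skip them.
   * txVisible is an assumed simplification of the wrapped transaction filter.
   */
  static SimpleScanner wrap(SimpleScanner delegate, long visibilityUpperBound,
      Collection<Long> invalids, LongPredicate txVisible) {
    Set<Long> invalidIds = new HashSet<>(invalids);
    return result -> {
      boolean moreRows = delegate.next(result);
      result.removeIf(c -> {
        long ts = c.getTimestamp();
        if (ts > visibilityUpperBound) {
          // could still be in progress: keep unless already marked invalid
          return invalidIds.contains(ts);
        }
        // otherwise defer to the (simplified) transaction visibility check
        return !txVisible.test(ts);
      });
      return moreRows;
    };
  }

  /** Test helper: a scanner that serves pre-built batches in order. */
  static SimpleScanner fromBatches(List<List<SimpleCell>> batches) {
    Iterator<List<SimpleCell>> it = batches.iterator();
    return result -> {
      if (it.hasNext()) {
        result.addAll(it.next());
      }
      return it.hasNext();
    };
  }
}
```

In a real coprocessor the wrapping would presumably be done with DelegatingInternalScanner inside the preFlush/preCompact hooks, as in the ZooKeeperScanPolicyObserver example above.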
Thanks.
> Remove the hooks in RegionObserver which are designed to construct a
> StoreScanner which is marked as IA.Private
> ---------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-19001
> URL: https://issues.apache.org/jira/browse/HBASE-19001
> Project: HBase
> Issue Type: Sub-task
> Components: Coprocessors
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Fix For: 2.0.0-alpha-4
>
> Attachments: HBASE-19001.patch
>
>
> There are three methods here
> {code}
>   KeyValueScanner preStoreScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
>       Store store, Scan scan, NavigableSet<byte[]> targetCols, KeyValueScanner s,
>       long readPt) throws IOException;
>
>   InternalScanner preFlushScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
>       Store store, List<KeyValueScanner> scanners, InternalScanner s, long readPoint)
>       throws IOException;
>
>   InternalScanner preCompactScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
>       Store store, List<? extends KeyValueScanner> scanners, ScanType scanType,
>       long earliestPutTs, InternalScanner s, CompactionLifeCycleTracker tracker,
>       CompactionRequest request, long readPoint) throws IOException;
> {code}
> For the flush and compact ones, we've discussed many times that it is not
> safe to let users inject a Filter or even implement their own InternalScanner
> using the store file scanners, as our correctness depends heavily on the
> complicated logic in SQM and StoreScanner. CP users are expected to wrap the
> original InternalScanner (it is a StoreScanner anyway) in the
> preFlush/preCompact methods to do filtering or something else.
> For preStoreScannerOpen, it even returns a KeyValueScanner, which is marked as
> IA.Private... This is less harmful, but still, we've decided not to expose
> StoreScanner to CP users, so this method is useless. CP users can use the
> preGetOp and preScannerOpen methods to modify the Get/Scan object passed in to
> inject into the scan operation.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)