[
https://issues.apache.org/jira/browse/HBASE-24321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099478#comment-17099478
]
Geoffrey Jacoby commented on HBASE-24321:
-----------------------------------------
1. Wrapping an InternalScanner seems much riskier for a coprocessor author to
get right. (Some of my colleagues and I have spent long debugging sessions
hunting for resource leaks caused by coprocs that got new scanner
instantiation wrong!)
The 2.x pattern, where HBase instantiates the scanner and the coproc author
just marks which parameters he or she wants to override, is really nice, and I
want to leverage that. There's nothing about minVersions that makes it any
different from KeepDeletedCells, TTL, or max versions, which already support
this -- it's simply a bit of column family config.
2. What about coprocs that need to access some piece of column-family metadata
plus the Scan to make the right choice? (For example, increasing max versions
for a CF if a Scan is raw, or checking some Scan attribute that triggers
special behavior only for certain Scans and CFs.) In that case, you need the
Scan in preStoreScannerOpen, don't you? Again, not to _modify_ -- my
implementation, which I'll post tomorrow, makes a copy of the Scan -- but just
to read.
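To make the two points above concrete, here is a rough sketch of the shape I
have in mind. It uses stand-in types only: ScanOptionsSketch, ScanView, Options,
StoreScannerHook, resolveOptions and the "example.keepAtLeast" attribute are all
made up for illustration and are not the real HBase classes or my actual patch.
The host seeds the options from CF config, the hook overrides what it needs
(including min versions), and the Scan is exposed only as a copied, read-only
view.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Stand-in types for illustration only; none of these are real HBase classes.
final class ScanOptionsSketch {

    /** Read-only snapshot of the Scan properties a coproc might inspect. */
    static final class ScanView {
        private final boolean raw;
        private final Map<String, String> attributes;

        ScanView(boolean raw, Map<String, String> attributes) {
            this.raw = raw;
            // Copy, then wrap: later changes to the caller's map are not seen,
            // and the hook cannot write through this view.
            this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
        }

        boolean isRaw() { return raw; }
        String getAttribute(String name) { return attributes.get(name); }
    }

    /** Per-scanner overrides, seeded from the column family's config. */
    static final class Options {
        private int maxVersions;
        private int minVersions;
        private final ScanView scan;

        Options(int maxVersions, int minVersions, ScanView scan) {
            this.maxVersions = maxVersions;
            this.minVersions = minVersions;
            this.scan = scan;
        }

        int getMaxVersions() { return maxVersions; }
        void setMaxVersions(int v) { maxVersions = v; }
        int getMinVersions() { return minVersions; }
        void setMinVersions(int v) { minVersions = v; }
        ScanView getScan() { return scan; } // readable, but there is no setter
    }

    /** What the coproc author writes: adjust options, never build a scanner. */
    interface StoreScannerHook {
        void preStoreScannerOpen(Options options);
    }

    /** Host side: seed options from CF config, run the hook, return the result. */
    static Options resolveOptions(int cfMaxVersions, int cfMinVersions,
                                  ScanView scan, StoreScannerHook hook) {
        Options options = new Options(cfMaxVersions, cfMinVersions, scan);
        hook.preStoreScannerOpen(options);
        return options;
    }

    /** Example hook; the attribute name is hypothetical. */
    static final StoreScannerHook EXAMPLE_HOOK = options -> {
        ScanView scan = options.getScan();
        if (scan.isRaw()) {
            // Raw scans should see every version of a cell.
            options.setMaxVersions(Integer.MAX_VALUE);
        }
        String keepAtLeast = scan.getAttribute("example.keepAtLeast");
        if (keepAtLeast != null) {
            // Only Scans carrying the attribute trigger the special behavior.
            options.setMinVersions(Integer.parseInt(keepAtLeast));
        }
    };
}
```

The point of the shape is that the hook never touches scanner lifecycle, so
there is nothing for it to leak, and it can still branch on the Scan without
being able to change it.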
Nothing I'm proposing here is new; it's just taking former 1.x functionality
and adapting it to the 2.x design patterns. I have a draft I'm running tests on
now, and the change is quite small.
> Add writable MinVersions and read-only Scan to coproc ScanOptions
> -----------------------------------------------------------------
>
> Key: HBASE-24321
> URL: https://issues.apache.org/jira/browse/HBASE-24321
> Project: HBase
> Issue Type: Improvement
> Reporter: Geoffrey Jacoby
> Assignee: Geoffrey Jacoby
> Priority: Major
> Fix For: 2.3.0
>
>
> Between HBase 1.x and 2.0, the RegionObserver pre*ScannerOpen coprocessors
> were significantly changed so that the coproc implementer no longer has
> access to the actual Scanner, just a ScanOptions object that can be changed
> in limited ways. This is safer and prevents resource leaks and other bugs.
> While ScanOptions provides support for changing TTL, KeepDeletedCells, and
> MaxVersions, a fourth column family config parameter, MinVersions, appears to
> have been missed. This prevents coproc implementers from changing MinVersions
> dynamically. An example of this is PHOENIX-5645, which in the forthcoming
> Phoenix 4.16 (based on HBase 1.x) will allow users to configure a moving
> window where all versions are kept, and thus point-in-time queries are safe.
> This cannot be put in the forthcoming Phoenix 5.1 (based on HBase 2.1 and
> 2.2) because of the coproc changes.
> Relatedly, preStoreScannerOpen lacks access to the Scan in HBase 2.0 and up.
> This prevents coprocs from reading the Scan parameters to check if, for
> example, a Scan has set the max time to a point in the past, and thus needs
> to override KeepDeletedCells. This can lead to incorrect behavior when doing
> point-in-time queries or using transactional engines that treat physically
> committed HBase writes as logically uncommitted parts of a transaction. It's
> also a correctness problem for PHOENIX-5645. Please note that only
> _read-only_ access to the Scan from the store scanner coproc hook is in scope
> for this change.
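
As a rough illustration of the point-in-time case in the description above, the
decision a coproc would make from the Scan could look like the sketch below.
PointInTimeSketch, KeepDeleted and chooseKeepDeleted are hypothetical stand-ins,
not HBase types, and the sketch assumes a Scan with no upper time bound reports
Long.MAX_VALUE as its max timestamp.

```java
// Stand-in types for illustration only; none of these are real HBase classes.
final class PointInTimeSketch {

    /** Simplified stand-in for the column family's KeepDeletedCells setting. */
    enum KeepDeleted { FALSE, TRUE }

    /**
     * Picks the setting a store scanner should use for one Scan.
     * Assumption: an unbounded Scan has scanMaxTimestamp == Long.MAX_VALUE.
     */
    static KeepDeleted chooseKeepDeleted(KeepDeleted cfDefault,
                                         long scanMaxTimestamp,
                                         long nowMillis) {
        // A max time before "now" means the Scan is reading a past snapshot,
        // so cells deleted after that point must still be visible to it.
        boolean lookingIntoPast = scanMaxTimestamp < nowMillis;
        return lookingIntoPast ? KeepDeleted.TRUE : cfDefault;
    }
}
```

Without read access to the Scan in the store scanner hook, there is no input
from which to compute scanMaxTimestamp, so this choice cannot be made per Scan.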