[ https://issues.apache.org/jira/browse/HBASE-24321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099497#comment-17099497 ]
Geoffrey Jacoby commented on HBASE-24321:
-----------------------------------------
[~zhangduo] - A coproc developer who implements a compaction coproc hook and
overrides MinVersions on ScanOptions has exactly one thing to worry about: that
he or she has set MinVersions to the wrong value, and hence that the cluster
may retain too many or too few versions on one or more tables/CFs.
A coproc developer who writes a complicated Scanner wrapper has the above to
worry about, as well as the prospect of resource leaks and accidentally
crashing a RegionServer or the entire cluster. (As I said, I've been there. :-) )
I don't see how the latter option is ever better, if both options are
available. Of course, there will be times when a coproc developer wants to do
something really complicated, and needs to do the latter option with all its
extra risks -- but a good API should make simple things easy and safe, and
limit the reasons you need to do dangerous ones. A good API exists here -- it
just needs to be extended.
What I'm proposing is pretty much exactly the same as HBASE-19895, which added
KeepDeletedCells to ScanOptions -- and which you were the approving committer
for, I believe. :-)
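To illustrate why the ScanOptions route carries so little risk, here is a minimal sketch. It assumes a writable MinVersions along the lines proposed in this issue. SketchScanOptions and MinVersionsOverride are hypothetical stand-ins invented for the example, not the HBase classes, and the real hook takes more parameters than shown here.

```java
// Hypothetical stand-in for the options object a compaction hook receives.
// It is NOT the HBase ScanOptions interface; it models only MinVersions.
final class SketchScanOptions {
    private int minVersions;

    SketchScanOptions(int minVersions) {
        this.minVersions = minVersions;
    }

    int getMinVersions() {
        return minVersions;
    }

    // Assumed setter, mirroring the writable MinVersions proposed in the issue.
    void setMinVersions(int minVersions) {
        if (minVersions < 0) {
            throw new IllegalArgumentException("minVersions must be >= 0");
        }
        this.minVersions = minVersions;
    }
}

final class MinVersionsOverride {
    // Hypothetical hook body: the only state touched is one integer, so the
    // only mistake available is choosing the wrong value. No scanner is
    // opened or wrapped, so there is nothing to leak or forget to close.
    static void preCompactScannerOpen(SketchScanOptions options, int desiredMinVersions) {
        options.setMinVersions(desiredMinVersions);
    }
}
```

A caller would construct the options with the column family's configured value and let the hook overwrite it, for example `MinVersionsOverride.preCompactScannerOpen(options, 3)`.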
> Add writable MinVersions and read-only Scan to coproc ScanOptions
> -----------------------------------------------------------------
>
> Key: HBASE-24321
> URL: https://issues.apache.org/jira/browse/HBASE-24321
> Project: HBase
> Issue Type: Improvement
> Reporter: Geoffrey Jacoby
> Assignee: Geoffrey Jacoby
> Priority: Major
> Fix For: 2.3.0
>
>
> Between HBase 1.x and 2.0, the RegionObserver pre*ScannerOpen coprocessors
> were significantly changed so that the coproc implementer no longer has
> access to the actual Scanner, just a ScanOptions object that can be changed
> in limited ways. This is safer and prevents resource leaks and other bugs.
> While ScanOptions provides support for changing TTL, KeepDeletedCells, and
> MaxVersions, a fourth column family config parameter, MinVersions, appears to
> have been missed. This prevents coproc implementers from changing MinVersions
> dynamically. An example of this is PHOENIX-5645, which in the forthcoming
> Phoenix 4.16 (based on HBase 1.x) will allow users to configure a moving
> window where all versions are kept, and thus point-in-time queries are safe.
> This cannot be put in the forthcoming Phoenix 5.1 (based on HBase 2.1 and
> 2.2) because of the coproc changes.
> Relatedly, preStoreScannerOpen lacks access to the Scan in HBase 2.0 and up.
> This prevents coprocs from reading the Scan parameters to check if, for
> example, a Scan has set the max time to a point in the past, and thus needs
> to override KeepDeletedCells. This can lead to incorrect behavior when doing
> point-in-time queries or using transactional engines that treat physically
> committed HBase writes as logically uncommitted parts of a transaction. It's
> also a correctness problem for PHOENIX-5645. Please note that only
> _read-only_ access to the Scan from the store scanner coproc hook is in scope
> for this change.
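The read-only Scan use case in the description can be sketched as follows, assuming the store scanner hook can see the Scan's maximum timestamp. Every type here (SketchReadOnlyScan, SketchStoreScanOptions, SketchKeepDeletedCells, PointInTimeOverride) is a hypothetical stand-in for illustration, not an HBase class, and the current time is passed in explicitly to keep the example deterministic.

```java
// Hypothetical two-valued stand-in for a keep-deleted-cells setting.
enum SketchKeepDeletedCells { FALSE, TRUE }

// Hypothetical read-only view of a scan: it exposes a getter and no setters.
final class SketchReadOnlyScan {
    private final long maxTimestamp;

    SketchReadOnlyScan(long maxTimestamp) {
        this.maxTimestamp = maxTimestamp;
    }

    long getMaxTimestamp() {
        return maxTimestamp;
    }
}

// Hypothetical options object: the scan is readable, KeepDeletedCells is writable.
final class SketchStoreScanOptions {
    private final SketchReadOnlyScan scan;
    private SketchKeepDeletedCells keepDeletedCells = SketchKeepDeletedCells.FALSE;

    SketchStoreScanOptions(SketchReadOnlyScan scan) {
        this.scan = scan;
    }

    SketchReadOnlyScan getScan() {
        return scan;
    }

    SketchKeepDeletedCells getKeepDeletedCells() {
        return keepDeletedCells;
    }

    void setKeepDeletedCells(SketchKeepDeletedCells keepDeletedCells) {
        this.keepDeletedCells = keepDeletedCells;
    }
}

final class PointInTimeOverride {
    // If the scan's upper time bound lies in the past, treat it as a
    // point-in-time query and keep deleted cells visible to it.
    static void preStoreScannerOpen(SketchStoreScanOptions options, long nowMillis) {
        if (options.getScan().getMaxTimestamp() < nowMillis) {
            options.setKeepDeletedCells(SketchKeepDeletedCells.TRUE);
        }
    }
}
```

The hook only reads from the scan and only writes to the options object, which matches the read-only scope stated in the description.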