[ 
https://issues.apache.org/jira/browse/HBASE-24321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099497#comment-17099497
 ] 

Geoffrey Jacoby commented on HBASE-24321:
-----------------------------------------

[~zhangduo] - A coproc developer who implements a compaction coproc hook and 
overrides MinVersions on ScanOptions has exactly one thing to worry about: 
setting MinVersions to the wrong value, so that the cluster retains too many 
or too few versions on one or more tables/CFs.
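
To make that first case concrete, here is a rough sketch of what the body of 
such a hook might do. It assumes the setMinVersions/getMinVersions accessors 
this issue proposes; ScanOptionsLike, SimpleScanOptions and raiseMinVersions 
are hypothetical stand-ins so the example runs without HBase, not the real 
interfaces.

```java
// Sketch only: ScanOptionsLike is a hypothetical stand-in, not the HBase interface.
class MinVersionsSketch {
    /** Stand-in for the part of ScanOptions this issue proposes to make writable. */
    interface ScanOptionsLike {
        int getMinVersions();
        void setMinVersions(int minVersions);
    }

    /** Trivial in-memory implementation so the sketch runs without a cluster. */
    static final class SimpleScanOptions implements ScanOptionsLike {
        private int minVersions;
        SimpleScanOptions(int minVersions) { this.minVersions = minVersions; }
        @Override public int getMinVersions() { return minVersions; }
        @Override public void setMinVersions(int minVersions) { this.minVersions = minVersions; }
    }

    /** What a compaction hook body might do: raise MinVersions, never lower it. */
    static void raiseMinVersions(ScanOptionsLike options, int requiredMinVersions) {
        if (requiredMinVersions < 0) {
            throw new IllegalArgumentException("requiredMinVersions must be >= 0");
        }
        options.setMinVersions(Math.max(options.getMinVersions(), requiredMinVersions));
    }

    public static void main(String[] args) {
        ScanOptionsLike options = new SimpleScanOptions(0);
        raiseMinVersions(options, 5);
        System.out.println(options.getMinVersions()); // prints 5
    }
}
```

The only state the hook touches is a single integer, which is the point: the 
worst outcome is a wrong retention value, not a leaked scanner.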

A coproc developer who writes a complicated Scanner wrapper has the above to 
worry about, as well as the prospect of resource leaks and accidentally 
crashing a RegionServer or the entire cluster. (As I said, I've been there. :-))

I don't see how the latter option is ever better when both options are 
available. Of course, there will be times when a coproc developer wants to do 
something really complicated and needs the latter option with all its extra 
risks -- but a good API should make simple things easy and safe, and limit 
the reasons you need to do dangerous ones. A good API exists here -- it just 
needs to be extended.

What I'm proposing is pretty much exactly the same as HBASE-19895, which added 
KeepDeletedCells to ScanOptions -- and which you were the approving committer 
for, I believe.  :-) 

> Add writable MinVersions and read-only Scan to coproc ScanOptions
> -----------------------------------------------------------------
>
>                 Key: HBASE-24321
>                 URL: https://issues.apache.org/jira/browse/HBASE-24321
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Geoffrey Jacoby
>            Assignee: Geoffrey Jacoby
>            Priority: Major
>             Fix For: 2.3.0
>
>
> Between HBase 1.x and 2.0, the RegionObserver pre*ScannerOpen coprocessors 
> were significantly changed so that the coproc implementer no longer has 
> access to the actual Scanner, just a ScanOptions object that can be changed 
> in limited ways. This is safer and prevents resource leaks and other bugs. 
> While ScanOptions provides support for changing TTL, KeepDeletedCells, and 
> MaxVersions, a fourth column family config parameter, MinVersions, appears to 
> have been missed. This prevents coproc implementers from changing MinVersions 
> dynamically. An example of this is PHOENIX-5645, which in the forthcoming 
> Phoenix 4.16 (based on HBase 1.x) will allow users to configure a moving 
> window where all versions are kept, and thus point-in-time queries are safe. 
> This cannot be put in the forthcoming Phoenix 5.1 (based on HBase 2.1 and 
> 2.2) because of the coproc changes.
> Relatedly, preStoreScannerOpen lacks access to the Scan in HBase 2.0 and up. 
> This prevents coprocs from reading the Scan parameters to check if, for 
> example, a Scan has set the max time to a point in the past, and thus needs 
> to override KeepDeletedCells. This can lead to incorrect behavior when doing 
> point-in-time queries or using transactional engines that treat physically 
> committed HBase writes as logically uncommitted parts of a transaction. It's 
> also a correctness problem for PHOENIX-5645. Please note that only 
> _read-only_ access to the Scan from the store scanner coproc hook is in scope 
> for this change.  
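
As a rough illustration of the second request in the description above, the 
sketch below shows how a store scanner hook might use read-only access to the 
Scan to decide whether to override KeepDeletedCells. The getScan accessor is 
the one this issue proposes; ReadOnlyScan, ScanOptionsLike, SimpleScanOptions 
and keepDeletesForPastScans are hypothetical stand-ins so the example runs 
without HBase, and the enum only mirrors the value names of HBase's 
KeepDeletedCells.

```java
// Sketch only: every type here is a hypothetical stand-in, not an HBase class.
class PointInTimeScanSketch {
    /** Mirrors the value names of HBase's KeepDeletedCells enum. */
    enum KeepDeletedCells { FALSE, TRUE, TTL }

    /** Read-only view of a Scan: exposes only the upper bound of its time range. */
    interface ReadOnlyScan {
        long getMaxTimestamp();
    }

    /** Stand-in for ScanOptions with the read-only Scan accessor proposed here. */
    interface ScanOptionsLike {
        ReadOnlyScan getScan();
        KeepDeletedCells getKeepDeletedCells();
        void setKeepDeletedCells(KeepDeletedCells keepDeletedCells);
    }

    /** Trivial in-memory implementation so the sketch runs without a cluster. */
    static final class SimpleScanOptions implements ScanOptionsLike {
        private final ReadOnlyScan scan;
        private KeepDeletedCells keepDeletedCells = KeepDeletedCells.FALSE;
        SimpleScanOptions(ReadOnlyScan scan) { this.scan = scan; }
        @Override public ReadOnlyScan getScan() { return scan; }
        @Override public KeepDeletedCells getKeepDeletedCells() { return keepDeletedCells; }
        @Override public void setKeepDeletedCells(KeepDeletedCells k) { this.keepDeletedCells = k; }
    }

    /** If the Scan looks at a point in the past, keep deleted cells visible to it. */
    static void keepDeletesForPastScans(ScanOptionsLike options, long nowMillis) {
        if (options.getScan().getMaxTimestamp() < nowMillis) {
            options.setKeepDeletedCells(KeepDeletedCells.TRUE);
        }
    }

    public static void main(String[] args) {
        long now = 10_000L;
        ScanOptionsLike pastScan = new SimpleScanOptions(() -> 5_000L);
        keepDeletesForPastScans(pastScan, now);
        System.out.println(pastScan.getKeepDeletedCells()); // prints TRUE
    }
}
```

The hook only reads from the Scan and only writes to the options object, which 
matches the read-only scope stated in the description.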



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
