[jira] [Commented] (HBASE-11803) Programming model for reverse scan is confusing

James Taylor (JIRA) Sat, 23 Aug 2014 11:02:42 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-11803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108074#comment-14108074
 ]


James Taylor commented on HBASE-11803:
--------------------------------------

Thanks for all the ideas, feedback, and workarounds everyone - much 
appreciated. 

bq. The number of bytes needed is indeterminate from the client API's 
perspective. It will vary by application keying strategy.
This is a very good point. I _think_ I've reasoned out that from a Phoenix POV, 
adding a single 0xFF byte is sufficient.

bq. We could do something in the context of this issue like add a static helper 
method in Scan that makes all the necessary transformations
From an API POV, I think this would be an improvement. Phoenix will likely 
stick with what it's doing now for a couple of reasons: 1) we wouldn't want to 
introduce a runtime dependency on a later 0.98 HBase version for an issue 
we've already worked around, and 2) I'd worry that there's unnecessary 
overhead in adding Filters (unnecessary in that I can reason out how many 
0xFF bytes to add to prevent any issues).

My reason for filing the JIRA is more around just giving my two cents on where 
I think HBase APIs can be improved. Phoenix hides all the complexity and 
nuances of using the HBase API by providing a well understood SQL API on top of 
it (that's part of its value). Please take/leave my feedback as you see fit.

Ideally, it'd be nice if HBase had a KeyRange class that includes: byte[] 
lowerRange, boolean lowerInclusive, byte[] upperRange, boolean upperInclusive. 
Then Scan would contain a KeyRange. I realize this is likely infeasible to 
change in HBase at this point, though. Maybe in 2.0? :-)
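
A minimal sketch of what such a class might look like, using the four fields 
named above. Everything beyond those field names (the constructor, the 
compare and contains helpers) is an assumption made for illustration and is 
not taken from HBase. Unbounded ranges (empty byte arrays) are deliberately 
not modeled.

```java
/** Illustrative only: a range of row keys with explicit bound inclusivity. */
final class KeyRange {
    final byte[] lowerRange;
    final boolean lowerInclusive;
    final byte[] upperRange;
    final boolean upperInclusive;

    KeyRange(byte[] lowerRange, boolean lowerInclusive,
             byte[] upperRange, boolean upperInclusive) {
        // Defensive copies so later changes to the caller's arrays do not leak in.
        this.lowerRange = lowerRange.clone();
        this.lowerInclusive = lowerInclusive;
        this.upperRange = upperRange.clone();
        this.upperInclusive = upperInclusive;
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        // Equal on the shared prefix: the shorter array sorts first.
        return a.length - b.length;
    }

    /** True if the row falls inside the range, honoring both inclusivity flags. */
    boolean contains(byte[] row) {
        int lo = compare(row, lowerRange);
        int hi = compare(row, upperRange);
        boolean aboveLower = lowerInclusive ? lo >= 0 : lo > 0;
        boolean belowUpper = upperInclusive ? hi <= 0 : hi < 0;
        return aboveLower && belowUpper;
    }
}
```

With the inclusivity carried as flags, reversing a scan would not require 
rewriting either bound; only the direction of iteration would change.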

> Programming model for reverse scan is confusing
> -----------------------------------------------
>
>                 Key: HBASE-11803
>                 URL: https://issues.apache.org/jira/browse/HBASE-11803
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 0.98.1
>            Reporter: James Taylor
>            Assignee: Ted Yu
>
> The reverse scan is a very nice feature in HBase. We leverage it in Apache 
> Phoenix 4.1 when possible and see a huge boost in performance over 
> re-ordering the result set ourselves.
> However, the way in which you have to adjust the start/stop key is confusing. 
> Our use case is that we have a scan that needs to be done and we've 
> calculated an inclusive start row and an exclusive stop row. This is the way 
> region boundaries are defined, which is convenient as they can easily be intersected 
> against the scan stop/start row. When we use a reverse scan, we are forced to 
> switch the start and stop row values of the scan *and* adjust the byte values 
> from inclusive to exclusive and from exclusive to inclusive. The former is 
> not too bad, as you can just add a zero byte, but the latter is problematic. 
> You can decrease the last byte by one, but you need to add an indeterminate 
> number of 0xFF bytes to ensure you're not including a row unintentionally.
> IMHO, it would be much cleaner to just keep the start/stop row as is and just 
> call the Scan.setReversed(true) method.
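
The two byte-level adjustments described above can be sketched in plain Java. 
This is an illustration only: the class ReverseScanKeys, the methods nextKey 
and approxPreviousKey, and the fill parameter are hypothetical names, not 
HBase or Phoenix API, and the sketch makes no claim about how a particular 
HBase version interprets the resulting keys.

```java
import java.util.Arrays;

final class ReverseScanKeys {
    /** Smallest key sorting strictly after the given key: append one 0x00 byte. */
    static byte[] nextKey(byte[] key) {
        // Arrays.copyOf pads the extra position with 0x00.
        return Arrays.copyOf(key, key.length + 1);
    }

    /**
     * Approximates the key just before the given key: decrement the last byte,
     * then append 'fill' bytes of 0xFF. How many fill bytes are enough depends
     * on the application's keying strategy, so 'fill' is left to the caller.
     */
    static byte[] approxPreviousKey(byte[] key, int fill) {
        if (key.length == 0) {
            throw new IllegalArgumentException("empty key has no predecessor");
        }
        int last = key[key.length - 1] & 0xFF;
        if (last == 0) {
            // Trailing 0x00: dropping it gives the exact predecessor.
            return Arrays.copyOf(key, key.length - 1);
        }
        byte[] out = Arrays.copyOf(key, key.length + fill);
        out[key.length - 1] = (byte) (last - 1);
        Arrays.fill(out, key.length, out.length, (byte) 0xFF);
        return out;
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

For the key 'ab', a plain decrement gives 'aa', which sorts before 'aab' and 
so would let that row slip into the range. With one fill byte the result is 
'aa' followed by 0xFF, which sorts after 'aab' but still before 'ab'. A row 
such as 'aa', 0xFF, 0x01 would sort after the padded key, which is why the 
number of fill bytes cannot be fixed without knowing the key layout.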



--
This message was sent by Atlassian JIRA
(v6.2#6252)
