[ 
https://issues.apache.org/jira/browse/HBASE-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957356#comment-13957356
 ] 

Andrew Purtell commented on HBASE-10854:
----------------------------------------

For the common use case where multiple data sets with different visibility 
labels are combined into a single large table, the user will use a schema with 
MAX_VERSIONS > 1. Users with differing authorizations will then read from the 
table: for some, the latest version(s) are what we are supposed to return, and 
for others, "older" version(s) are. "Inconsistent" views over the multiple 
versions, depending on user authorizations, are the desired behavior.

As Anoop says:

bq. The visibility based evaluation and cell filtering will happen in Filter 
level while on a top layer (after this filtering) the filtering based on the 
number of max versions will happen. (In SQM)

HBase's internal handling of multiple cell versions can produce surprising 
behavior when using visibility labels. The code is functioning correctly: as 
long as multiple versions with different labels are accessible to the scanner, 
the scanner will filter out what is not visible and return what is. 
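
To make the ordering concrete, here is a toy model in Python of the two stages 
Anoop describes: visibility filtering first, then the max-versions cut. It is 
only a sketch and not HBase code. The names Cell, is_visible, scan and flush 
are hypothetical, the label evaluator handles only simple '&' and '|' 
expressions without parentheses, and flush() assumes a flush retains only the 
newest max_versions cells, which is what the report below observes when the 
table keeps a single version.

```python
from typing import List, NamedTuple, Optional, Set


class Cell(NamedTuple):
    ts: int                # timestamp; larger means newer
    value: str
    label: Optional[str]   # visibility expression; None means unlabeled


def is_visible(label: Optional[str], auths: Set[str]) -> bool:
    """Simplified evaluator: '|' separates clauses, '&' joins labels in a clause."""
    if label is None:
        return True
    return any(
        all(term.strip() in auths for term in clause.split("&"))
        for clause in label.split("|")
    )


def scan(cells: List[Cell], auths: Set[str], max_versions: int) -> List[str]:
    """Stage 1: drop cells the user may not see. Stage 2: keep the newest N."""
    newest_first = sorted(cells, key=lambda c: c.ts, reverse=True)
    visible = [c for c in newest_first if is_visible(c.label, auths)]
    return [c.value for c in visible[:max_versions]]


def flush(cells: List[Cell], max_versions: int) -> List[Cell]:
    """Assumption: a flush keeps only the newest N cells, regardless of labels."""
    return sorted(cells, key=lambda c: c.ts, reverse=True)[:max_versions]


# The four puts from the report below, oldest first.
cells = [
    Cell(1, "v1", "A"),
    Cell(2, "v1all", None),
    Cell(3, "v1aOrB", "A|B"),
    Cell(4, "v1aAndB", "A&B"),
]

print(scan(cells, set(), 1))            # user with no authorizations, before flush
print(scan(flush(cells, 1), set(), 1))  # same user, after flush
print(scan(cells, {"A"}, 2))            # user holding label A, two versions
```

In this model a user with no authorizations sees v1all while all four cells 
are still available to the scan, and nothing once only the newest cell 
survives the flush. With max_versions > 1 on the table, the older cells would 
remain available after a flush as well.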

If I can suggest a way to proceed, it would be:
1. Try out the visibility labels feature.
2. Where you find the behavior surprising in some way, describe your 
observations on this issue or others.
3. In some cases we can document the behavior you are observing in the online 
manual as expected, with some advice.
4. In other cases, we can look at changing how HBase internally handles 
multiple versions of cells, to avoid surprising behavior that we agree by 
consensus should be considered incorrect or ugly.


> Multiple Row/VisibilityLabels visible while in the memstore
> -----------------------------------------------------------
>
>                 Key: HBASE-10854
>                 URL: https://issues.apache.org/jira/browse/HBASE-10854
>             Project: HBase
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.98.1
>            Reporter: Matteo Bertozzi
>
> If we update the row multiple times with different visibility labels,
> we are able to get the "old version" of the row until it is flushed:
> {code}
> $ sudo -u hbase hbase shell
> hbase> add_labels 'A'
> hbase> add_labels 'B'
> hbase> create 'tb', 'f1'
> hbase> put 'tb', 'row', 'f1:q', 'v1', {VISIBILITY=>'A'}
> hbase> put 'tb', 'row', 'f1:q', 'v1all'
> hbase> put 'tb', 'row', 'f1:q', 'v1aOrB', {VISIBILITY=>'A|B'}
> hbase> put 'tb', 'row', 'f1:q', 'v1aAndB', {VISIBILITY=>'A&B'}
> hbase> scan 'tb'
> row column=f1:q, timestamp=1395948168154, value=v1aAndB
> 1 row
> $ sudo -u testuser hbase shell
> hbase> scan 'tb'
> row column=f1:q, timestamp=1395948168102, value=v1all
> 1 row
> {code}
> When we flush the memstore we get a single row (the last one inserted),
> so the testuser now gets 0 rows.
> {code}
> $ sudo -u hbase hbase shell
> hbase> flush 'tb'
> hbase> scan 'tb'
> row column=f1:q, timestamp=1395948168154, value=v1aAndB
> 1 row
> $ sudo -u testuser hbase shell
> hbase> scan 'tb'
> 0 row
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
