[
https://issues.apache.org/jira/browse/HBASE-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957356#comment-13957356
]
Andrew Purtell commented on HBASE-10854:
----------------------------------------
For the common use case where multiple data sets with different visibility
labels are combined into a single large table, the user will use a schema with
MAX_VERSIONS > 1. Users with differing authorizations will then read from the
table. For some, the latest version(s) are what we are supposed to return; for
others, "older" version(s) are what we are supposed to return.
"Inconsistent" views over the multiple versions, depending on user
authorizations, are the desired behavior.
As Anoop says:
bq. The visibility based evaluation and cell filtering will happen in Filter
level while on a top layer (after this filtering) the filtering based on the
number of max versions will happen. (In SQM)
HBase's internal handling of multiple cell versions can produce surprising
behavior when using visibility labels. The code is functioning correctly: as
long as multiple versions with different labels are accessible to the scanner,
the scanner will filter out what is not visible and return what is.
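To make the ordering concrete, here is a toy Python model of the behavior
reported below. It is not HBase code. The names Cell, scan and flush, the
timestamps 1 to 4, and the choice to give the superuser both labels are all
assumptions made for illustration. It only mimics the order Anoop describes
(visibility filtering first, version limit second) and assumes, as the report
suggests, that a flush keeps the newest MAX_VERSIONS cells regardless of label.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class Cell:
    timestamp: int
    value: str
    # Predicate over the reader's authorizations; None means "no label".
    visible_to: Optional[Callable[[FrozenSet[str]], bool]] = None


def scan(cells: List[Cell], auths: FrozenSet[str], max_versions: int = 1) -> List[str]:
    """Filter by visibility first, then apply the version limit."""
    visible = [c for c in cells if c.visible_to is None or c.visible_to(auths)]
    visible.sort(key=lambda c: c.timestamp, reverse=True)
    return [c.value for c in visible[:max_versions]]


def flush(cells: List[Cell], max_versions: int = 1) -> List[Cell]:
    """Assumed flush behavior: keep the newest versions, ignoring labels."""
    return sorted(cells, key=lambda c: c.timestamp, reverse=True)[:max_versions]


# The four puts from the issue description, with hypothetical timestamps.
memstore = [
    Cell(1, "v1", lambda a: "A" in a),
    Cell(2, "v1all"),
    Cell(3, "v1aOrB", lambda a: "A" in a or "B" in a),
    Cell(4, "v1aAndB", lambda a: "A" in a and "B" in a),
]

superuser = frozenset({"A", "B"})  # assumption: sees every label
testuser = frozenset()             # no authorizations

before_flush = (scan(memstore, superuser), scan(memstore, testuser))

flushed_one = flush(memstore, max_versions=1)
after_flush = (scan(flushed_one, superuser), scan(flushed_one, testuser))

# With more versions retained, the unlabeled cell survives the flush.
flushed_four = flush(memstore, max_versions=4)
after_flush_multi = scan(flushed_four, testuser)

print(before_flush)
print(after_flush)
print(after_flush_multi)
```

In this model, testuser reads v1all while all four versions are still
available and nothing once only the newest version remains. Retaining more
versions lets the older, visible cell keep being returned, which is the
MAX_VERSIONS > 1 use case described above.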
If I can suggest a way to proceed, it would be:
1. Try out the visibility labels feature.
2. Where you find the behavior surprising in some way, describe your
observations on this issue or others.
3. In some cases we can document the behavior you are observing in the online
manual as expected, with some advice.
4. In other cases, we can look at changing how HBase internally handles
multiple versions of cells, to avoid surprising behavior that we agree by
consensus should be considered incorrect or ugly.
> Multiple Row/VisibilityLabels visible while in the memstore
> -----------------------------------------------------------
>
> Key: HBASE-10854
> URL: https://issues.apache.org/jira/browse/HBASE-10854
> Project: HBase
> Issue Type: Bug
> Components: security
> Affects Versions: 0.98.1
> Reporter: Matteo Bertozzi
>
> If we update the row multiple times with different visibility labels,
> we are able to get the "old version" of the row until it is flushed.
> {code}
> $ sudo -u hbase hbase shell
> hbase> add_labels 'A'
> hbase> add_labels 'B'
> hbase> create 'tb', 'f1'
> hbase> put 'tb', 'row', 'f1:q', 'v1', {VISIBILITY=>'A'}
> hbase> put 'tb', 'row', 'f1:q', 'v1all'
> hbase> put 'tb', 'row', 'f1:q', 'v1aOrB', {VISIBILITY=>'A|B'}
> hbase> put 'tb', 'row', 'f1:q', 'v1aAndB', {VISIBILITY=>'A&B'}
> hbase> scan 'tb'
> row column=f1:q, timestamp=1395948168154, value=v1aAndB
> 1 row
> $ sudo -u testuser hbase shell
> hbase> scan 'tb'
> row column=f1:q, timestamp=1395948168102, value=v1all
> 1 row
> {code}
> When we flush the memstore we get a single row (the last one inserted),
> so the testuser gets 0 rows now.
> {code}
> $ sudo -u hbase hbase shell
> hbase> flush 'tb'
> hbase> scan 'tb'
> row column=f1:q, timestamp=1395948168154, value=v1aAndB
> 1 row
> $ sudo -u testuser hbase shell
> hbase> scan 'tb'
> 0 row
> {code}