[ 
https://issues.apache.org/jira/browse/HBASE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13393697#comment-13393697
 ] 

Lars George commented on HBASE-6222:
------------------------------------

bq. @Andy That's not essential to storing labels in a metacolumn, though it may 
be advisable for performance reasons.

Understood. I am not saying this is needed, or that metacolumns do not work 
without it. In fact, I think they are very useful in the context you 
discussed with Matt, for example TTLs. I personally think there is a need 
for two optional features: a) metacolumns - which cover broader rules for 
many columns or rows, and b) KV tags - which are carried as low as they can get 
to retain per-cell information. 

So for TTL I would think that tags are too low-level, yet for security I do 
think that metacolumns are too "weak" a guarantee.

bq. @Andy and @Matt: So we may have this as a way to store tags inline with 
data, with dedup/optimize away if not needed; and we may have Lars' somehow tag 
structure addition to KV (Lars: what would that look like?). Worth doing a 
bake-off?

I think this is not either/or. If we add Trie compression - and Matt, please 
correct me if I am mistaken - then we can leverage that implementation to 
handle the tags. If we decide not to merge the two, then we can use my 
suggestion of adding them to the KV optionally, and we can handle the 
compression implications later.

bq. @Andy: We could agree on criteria such as: Tag storage optimized out if no 
tags present

Indeed, since we use a new type, no extra storage is needed if no tag is 
attached.

bq. @Andy: Compartmentalized changes

Agreed, we add a new type and handle that case separately. Though the majority 
of the code is shared, the new type would trigger the extraction of the tags if 
called for (which I assume would be done lazily).
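
As a rough illustration of the "new type" idea, the sketch below uses a 
deliberately simplified cell layout, not the real KeyValue format. The class 
name, the type codes, and the layout are all assumptions made for the example. 
It shows the two properties discussed above: an untagged cell encodes to 
exactly the same bytes as before, and tags are only parsed when a caller asks 
for them.

```java
import java.nio.ByteBuffer;

// Illustrative only: a simplified layout, NOT the actual HBase KeyValue format.
final class TagAwareCodec {
    // Hypothetical type codes chosen for this sketch.
    static final byte TYPE_PUT = 4;
    static final byte TYPE_PUT_WITH_TAGS = 5;

    // Layout: [type:1][valueLen:4][value] and, only for the tagged type,
    // [tagsLen:4][tags]. Untagged cells pay no extra bytes.
    static byte[] encode(byte[] value, byte[] tags) {
        boolean tagged = tags != null && tags.length > 0;
        int size = 1 + 4 + value.length + (tagged ? 4 + tags.length : 0);
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.put(tagged ? TYPE_PUT_WITH_TAGS : TYPE_PUT);
        buf.putInt(value.length);
        buf.put(value);
        if (tagged) {
            buf.putInt(tags.length);
            buf.put(tags);
        }
        return buf.array();
    }

    // Lazy extraction: the tag block is located and copied only on request.
    static byte[] decodeTags(byte[] encoded) {
        ByteBuffer buf = ByteBuffer.wrap(encoded);
        byte type = buf.get();
        if (type != TYPE_PUT_WITH_TAGS) {
            return new byte[0]; // the old type never carries tags
        }
        int valueLen = buf.getInt();
        buf.position(buf.position() + valueLen); // skip over the value
        byte[] tags = new byte[buf.getInt()];
        buf.get(tags);
        return tags;
    }
}
```

Code that never calls decodeTags() behaves as it did before the new type 
existed, which is the compartmentalization asked for above.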

bq. @Andy: Generic mechanism for adding, reading, removing, and modifying tags, 
usable by coprocessors.

These are the KeyValue.addTag(byte[] name, byte[] value) and 
KeyValue.getTag(byte[] name) helpers I was referring to. Coprocessors have 
full access that way, since the tags are carried with each KV.

bq. @Andy: No we don't have to mimic the Accumulo API though if the goal here 
is to be an alternative, it must be possible to build a direct API translation 
shim that provides the same labelling and visibility semantics.

Indeed. One of the arguments I hear comparing HBase and Accumulo is the fact 
that we have no cell level security tagging. That is what this is all about. My 
proposal is - as far as I can tell - lean (as it uses no extra storage if not 
used), can be combined with the non-cell level security (you might not want 
this level of security to avoid extra baggage), does not change the 
comparators, and overall is quite non-intrusive in existing code. On the other 
hand it seems useful for other cell level features in the future.

As Jon says, Accumulo uses these tags and an always-on filter to achieve 
security (viewed at a very high level), and so can we. For me that makes the 
two comparable. We do not need to comply with the entire API, only to match it 
at the feature set level.

bq. @stack: A core of required's with optional tags that don't cost unless you 
use them would be grand. 

That is exactly my point. As for "KV in KV", I do not see how this is "odd": 
our KeyValue is, for starters, the odd one, given what most people understand 
a KV to be. Coming to terms with our complex key and various sorting rules is 
not trivial.

bq. @stack: Good point. Maybe not even lost, mayhaps a bug would cause us to 
skip the metacolumn?

Spot on!

bq. @Matt: I guess I'm saying it's maybe ok to muck up the current KV even more 
given that data block encoding should be able to clean up the mess down the 
road. That being said, I don't personally need this feature so I hate to 
suggest mucking up anything!

Agreed, this is about timing as well. Your patch is highly intrusive - but for 
good reasons. So I would love to discuss this current issue with your changes 
already applied. On the other hand, we have to make a call on what we want 
and when.

@Laxman: The basic premise here is to be on par with Accumulo security-wise. 
That is the use-case. As for scalability, I do not see why a few extra bytes 
and a coprocessor that checks them would be disastrous. Sure, this needs 
evaluation, but we know that other systems - like Accumulo - do it, so if 
someone wants to enable it, they should see the same impact, small or big. Or, 
asking the other way around: where do you see this affecting performance?

bq. How about other approach of supporting access control through HBase views?

The issue is that these are typically only on the row level. With the cell 
level you can filter as fine grained as possible. Views - and please object if 
I am wrong - are more coarse grained. Think of blocking access to some columns 
differently across many rows, not just allowing all CF/CQs for all rows.

The latter is the crucial difference in what is needed to be on par.
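
To illustrate the difference, here is a small sketch of a per-cell check. It 
is a simplification under stated assumptions: each cell carries at most one 
label and a user holds a flat set of authorizations, whereas Accumulo 
evaluates boolean visibility expressions. The class and field names are 
hypothetical and not part of HBase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical per-cell visibility check; names are made up for this sketch.
final class CellVisibilityFilter {
    // A cell with an optional single visibility label (null means unlabelled).
    static final class Cell {
        final String row;
        final String column;
        final String label;

        Cell(String row, String column, String label) {
            this.row = row;
            this.column = column;
            this.label = label;
        }
    }

    // Keeps unlabelled cells plus cells whose label the user is authorized for.
    static List<Cell> visibleTo(Set<String> authorizations, List<Cell> cells) {
        List<Cell> visible = new ArrayList<>();
        for (Cell cell : cells) {
            if (cell.label == null || authorizations.contains(cell.label)) {
                visible.add(cell);
            }
        }
        return visible;
    }
}
```

Because the decision is made per cell, the same column can be visible in one 
row and hidden in another, which a rule covering whole rows or whole columns 
cannot express.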
                
> Add per-KeyValue Security
> -------------------------
>
>                 Key: HBASE-6222
>                 URL: https://issues.apache.org/jira/browse/HBASE-6222
>             Project: HBase
>          Issue Type: New Feature
>          Components: security
>            Reporter: stack
>
> Saw an interesting article: 
> http://www.fiercegovernmentit.com/story/sasc-accumulo-language-pro-open-source-say-proponents/2012-06-14
> "The  Senate Armed Services Committee version of the fiscal 2013 national 
> defense authorization act (S. 3254) would require DoD agencies to foreswear 
> the Accumulo NoSQL database after Sept. 30, 2013, unless the DoD CIO 
> certifies that there exists either no viable commercial open source database 
> with security features comparable to [Accumulo] (such as the HBase or 
> Cassandra databases)..."
> Not sure what a 'commercial open source database' is, and I'm not sure what's 
> going on in the article, but tra-la-la'ing, if we had per-KeyValue 'security' 
> like Accumulo's, we might put ourselves in the running for federal 
> contributions?


        

Reply via email to