[ https://issues.apache.org/jira/browse/HBASE-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894437#action_12894437 ]
Todd Lipcon commented on HBASE-2893:
------------------------------------
I agree with Jonathan's sentiment that we should try to fit this kind of thing
into a framework rather than core if possible.
Regarding the use case of per-cell ACLs, this is a requirement for a lot of
government users, where each piece of information may have a different security
clearance, and clearance is very granularly controlled. I could see
implementing this, though, by using a coprocessor which intercepts all
reads/writes and, for every column cf:foo, first checks a cf:_acl_foo before
returning results or passing through the write.
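
To make the cf:_acl_foo idea concrete, here is a rough sketch in plain Python
rather than against the real coprocessor API. A row is modeled as a dict of
column name to value. The helper names (acl_column, filter_row, check_put), the
comma-separated ACL encoding, and the rule that a missing ACL cell means "no
restriction" are all assumptions made for illustration.

```python
ACL_PREFIX = "_acl_"  # assumed convention: cf:foo is guarded by cf:_acl_foo


def acl_column(column):
    """Map 'cf:foo' to its companion ACL column 'cf:_acl_foo'."""
    family, qualifier = column.split(":", 1)
    return f"{family}:{ACL_PREFIX}{qualifier}"


def is_acl_column(column):
    """True if the column is itself an ACL companion column."""
    return column.split(":", 1)[1].startswith(ACL_PREFIX)


def allowed(row, column, user):
    """True if the column has no ACL cell, or the user is listed in it."""
    acl = row.get(acl_column(column))
    if acl is None:
        return True  # assumption: no ACL cell means no restriction
    return user in acl.split(",")


def filter_row(row, user):
    """Read path: hide ACL cells and any cell the user may not see."""
    return {
        col: val
        for col, val in row.items()
        if not is_acl_column(col) and allowed(row, col, user)
    }


def check_put(row, column, user):
    """Write path: raise instead of passing the write through."""
    # Assumption: ordinary users may never write ACL cells directly.
    if is_acl_column(column) or not allowed(row, column, user):
        raise PermissionError(f"{user} may not write {column}")


row = {"cf:foo": "secret", "cf:_acl_foo": "alice,bob", "cf:bar": "public"}
print(filter_row(row, "alice"))  # sees cf:foo and cf:bar
print(filter_row(row, "carol"))  # sees only cf:bar
```

Note that every read of cf:foo costs an extra lookup of cf:_acl_foo in the same
row, which is the overhead this approach would have to keep small.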
Regarding the multitenancy use case, I imagine an infrastructure-as-a-service
deployment of HBase would probably be going through some intermediary layer
anyway to give users the illusion that they aren't on a shared deployment. E.g.,
any access would have "user_foo_" prepended to all row keys. Having security
integration is important to authenticate the user, but per-row ACLs seem
expensive for that use case.
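
A minimal sketch of such an intermediary layer, assuming a plain dict stands in
for the shared table. The TenantView class and its method names are
hypothetical; only the "user_foo_" prefix format comes from the example above.

```python
class TenantView:
    """Presents a shared key/value store as if it belonged to one tenant."""

    def __init__(self, store, user):
        self._store = store             # shared dict standing in for a table
        self._prefix = f"user_{user}_"  # e.g. "user_foo_" for user "foo"

    def put(self, row_key, value):
        """Write under the tenant's prefix; the caller never sees the prefix."""
        self._store[self._prefix + row_key] = value

    def get(self, row_key):
        """Read a row by its unprefixed key, or None if it does not exist."""
        return self._store.get(self._prefix + row_key)

    def scan(self):
        """Yield (row_key, value) for this tenant only, prefix stripped."""
        for key in sorted(self._store):
            if key.startswith(self._prefix):
                yield key[len(self._prefix):], self._store[key]


shared = {}
foo = TenantView(shared, "foo")
bar = TenantView(shared, "bar")
foo.put("row1", "x")
bar.put("row1", "y")
print(list(foo.scan()))  # only foo's row, keyed as "row1"
```

The layer only isolates tenants if the authenticated user name is trusted and
the prefix encoding is unambiguous: with this naive format, a user named "foo"
would also match keys written by a user named "foo_bar".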
> Table metacolumns
> -----------------
>
> Key: HBASE-2893
> URL: https://issues.apache.org/jira/browse/HBASE-2893
> Project: HBase
> Issue Type: New Feature
> Reporter: Andrew Purtell
>
> Some features like TTLs or access control lists have use cases that call for
> per-value configurability.
> Currently in HBase TTLs are set per column family. This leads to potentially
> awkward "bucketing" of values into column families set up to accommodate the
> common desired TTLs for all values within -- an unnecessarily wide schema,
> with resulting unnecessary reduction in I/O locality in access patterns, more
> store files than otherwise, and so on.
> Over in HBASE-1697 we're considering setting ACLs on column families.
> However, we are aware of other BT-like systems which support per-value ACLs.
> This allows for multitenancy in a single table as opposed to really requiring
> tables for each customer (or at least column families). The scale-out
> properties for a single table are better than alternatives. I think
> supporting per-row ACLs would be generally sufficient: customer ID could be
> part of the row key. We can still plan to maintain column-family level ACLs.
> We would therefore not have to bloat the store with per-row ACLs for the
> normal case -- but it would be highly useful to support overrides for
> particular rows. So how to do that?
> I propose to introduce _metacolumns_.
> A _metacolumn_ would be a column family intrinsic to every table, created by
> the system at table create time. It would be accessible and would function
> like any other column family, with two exceptions: we expect a default ACL
> that only allows access by the system and operator principals, and
> administrative actions such as renaming or deletion would not be allowed.
> Into the metacolumn would be stored per-row overrides for such things as ACLs
> and TTLs. The metacolumn therefore would be as sparse as possible; no storage
> would be required for any overrides if a value is committed with defaults. A
> reasonably sparse metacolumn for a region may fit entirely within blockcache.
> It may be possible for all metacolumns on a RS to fit within blockcache
> without undue pressure on other users. We can aim design effort at this
> target.
> The scope of changes required to support this is:
> - Introduce metacolumn concept in the code and into the security model
> (default ACL): A flag in HCD, a default ACL, and a few additional checks for
> rejecting disallowed administrative actions.
> - Automatically create metacolumns at table create time.
> - Consult metacolumn as part of processing reads or mutations, perhaps using
> a bloom filter to shortcut lookups for rows with no metaentries, and apply
> configuration or security policy overrides if found.