Table metacolumns
-----------------

                 Key: HBASE-2893
                 URL: https://issues.apache.org/jira/browse/HBASE-2893
             Project: HBase
          Issue Type: New Feature
            Reporter: Andrew Purtell


Some features, such as TTLs or access control lists (ACLs), have use cases that 
call for per-value configurability. 

Currently in HBase, TTLs are set per column family. This leads to potentially 
awkward "bucketing" of values into column families set up to accommodate the 
common desired TTLs for all values within. The result is an unnecessarily wide 
schema, with a corresponding reduction in I/O locality in access patterns, more 
store files than otherwise needed, and so on.

Over in HBASE-1697 we're considering setting ACLs on column families. However, 
we are aware of other BigTable-like systems which support per-value ACLs. This 
allows for multitenancy in a single table, as opposed to requiring a table (or 
at least a column family) for each customer. The scale-out properties of a 
single table are better than those of the alternatives. I think supporting 
per-row ACLs would be generally sufficient: customer ID could be part of the 
row key. We can still plan to maintain column-family level ACLs, so we would 
not have to bloat the store with per-row ACLs for the normal case, but it would 
be highly useful to support overrides for particular rows. So how to do that?

I propose to introduce _metacolumns_. 

A _metacolumn_ would be a column family intrinsic to every table, created by 
the system at table create time. It would be accessible and would function like 
any other column family, with two exceptions: we expect a default ACL that only 
allows access by the system and operator principals, and administrative actions 
such as renaming or deletion would not be allowed. Into the metacolumn would be 
stored per-row overrides for such things as ACLs and TTLs. The metacolumn 
therefore would be as sparse as possible; no storage would be required for 
overrides if a value is committed with defaults. A reasonably sparse metacolumn 
for a region may fit entirely within blockcache. It may be possible for all 
metacolumns on a region server (RS) to fit within blockcache without undue 
pressure on other users. We can aim design effort at this target. 
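
To make the "sparse overrides with family-level defaults" idea concrete, here 
is a minimal sketch in plain Java. It is not HBase code: the class 
MetacolumnSketch, its nested RowOverride type, and all method names are 
hypothetical, and an in-memory map stands in for the metacolumn. It only 
illustrates that a row committed with defaults stores nothing, and that lookups 
fall back to the column family settings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these types or methods exist in HBase.
class MetacolumnSketch {
    // Column-family defaults, which is where TTLs are configured today.
    private final long defaultTtlSeconds;
    private final String defaultAcl;

    // One entry per overriding row. A null field inside an entry means
    // "fall back to the column family default" for that setting.
    static final class RowOverride {
        final Long ttlSeconds;
        final String acl;

        RowOverride(Long ttlSeconds, String acl) {
            this.ttlSeconds = ttlSeconds;
            this.acl = acl;
        }
    }

    // Stand-in for the metacolumn: sparse, keyed by row key.
    private final Map<String, RowOverride> overrides = new HashMap<>();

    MetacolumnSketch(long defaultTtlSeconds, String defaultAcl) {
        this.defaultTtlSeconds = defaultTtlSeconds;
        this.defaultAcl = defaultAcl;
    }

    // Record overrides for a row; pass null for settings left at default.
    void put(String rowKey, Long ttlOverride, String aclOverride) {
        // A value committed with defaults stores nothing at all.
        if (ttlOverride == null && aclOverride == null) {
            return;
        }
        overrides.put(rowKey, new RowOverride(ttlOverride, aclOverride));
    }

    long effectiveTtl(String rowKey) {
        RowOverride o = overrides.get(rowKey);
        return (o != null && o.ttlSeconds != null) ? o.ttlSeconds : defaultTtlSeconds;
    }

    String effectiveAcl(String rowKey) {
        RowOverride o = overrides.get(rowKey);
        return (o != null && o.acl != null) ? o.acl : defaultAcl;
    }

    // Number of rows that actually consume metacolumn storage.
    int overrideCount() {
        return overrides.size();
    }
}
```

With a family default TTL of one day, a row written with put(row, null, null) 
adds no entry, while put(row, 3600L, null) adds one entry that overrides only 
the TTL and still inherits the family ACL.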

The scope of changes required to support this is:

- Introduce the metacolumn concept in the code and in the security model: a 
flag in HCD (HColumnDescriptor), a default ACL, and a few additional checks for 
rejecting disallowed administrative actions.

- Automatically create metacolumns at table create time.

- Consult the metacolumn as part of processing reads or mutations, perhaps 
using a bloom filter to shortcut lookups for rows with no metacolumn entries, 
and apply configuration or security policy overrides if found.
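
The read-path and administrative checks in the list above could look roughly 
like the following sketch. It is plain Java, not HBase code, and every name in 
it (MetacolumnReadPath, RowBloom, aclFor, adminActionAllowed) is hypothetical. 
The Bloom filter is a deliberately tiny two-hash version meant only to show the 
shortcut: a negative answer is definite, so rows with no override never touch 
the metacolumn, while a positive answer still needs the real lookup.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these classes are illustrative and are not HBase APIs.
class MetacolumnReadPath {
    // Minimal Bloom filter over row keys: no false negatives, and
    // occasional false positives that cost one extra lookup.
    static final class RowBloom {
        private final BitSet bits;
        private final int size;

        RowBloom(int size) {
            this.size = size;
            this.bits = new BitSet(size);
        }

        private int h1(String key) {
            return Math.floorMod(key.hashCode(), size);
        }

        private int h2(String key) {
            return Math.floorMod(key.hashCode() * 31 + key.length() * 17 + 7, size);
        }

        void add(String key) {
            bits.set(h1(key));
            bits.set(h2(key));
        }

        boolean mightContain(String key) {
            return bits.get(h1(key)) && bits.get(h2(key));
        }
    }

    private final RowBloom bloom = new RowBloom(1 << 16);
    // Stand-in for the metacolumn contents: row key -> ACL override.
    private final Map<String, String> aclOverrides = new HashMap<>();
    private final String familyAcl;

    // Counts how often the metacolumn itself had to be read.
    int metacolumnLookups = 0;

    MetacolumnReadPath(String familyAcl) {
        this.familyAcl = familyAcl;
    }

    void setOverride(String rowKey, String acl) {
        aclOverrides.put(rowKey, acl);
        bloom.add(rowKey);
    }

    // ACL to enforce for a read or mutation on the given row.
    String aclFor(String rowKey) {
        if (!bloom.mightContain(rowKey)) {
            return familyAcl; // definite miss: skip the metacolumn entirely
        }
        metacolumnLookups++;
        String override = aclOverrides.get(rowKey);
        return override != null ? override : familyAcl;
    }

    // Renaming or deleting a metacolumn is rejected; other actions pass.
    static boolean adminActionAllowed(boolean isMetacolumn, String action) {
        boolean restricted = action.equals("rename") || action.equals("delete");
        return !(isMetacolumn && restricted);
    }
}
```

This sketch never removes entries from the filter, so clearing an override 
would leave a stale positive behind. That is harmless for correctness, since 
the lookup simply falls back to the family ACL, but a real design would need to 
decide how the filter gets rebuilt.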



-- 
This message is automatically generated by JIRA.