The "Hive/AuthDev" page has been changed by HeYongqiang.
http://wiki.apache.org/hadoop/Hive/AuthDev

--------------------------------------------------

New page:

= 1. Privilege =

== 1.1 Access Privilege ==

There are four levels of privilege: admin, database (DB), table, and column.

1.1.1 Admin privileges are global privileges and are used to perform 
administration.

1.1.2 DB privileges are database specific and apply to all objects inside that 
database.

1.1.3 Table privileges apply to a table, view, or index in a given database.

1.1.4 Column privileges apply to individual columns.

All DB, table, and column privileges distinguish read from write, even though 
Hive does not currently support column-level overwrite. There is no 
partition-level privilege.

= 2. Hive Operations =

create index/drop index

create database/drop database

create table/drop table

create view/drop view

alter table

show databases

lock table/unlock table/show lock

add partition

archive

Select

insert overwrite directory

insert overwrite table

Others include "create table as", "create table like", etc.

= 3. Metadata =

Store the privilege information in the new metastore tables 'user', 'db', 
['host'], 'tables_priv', and 'columns_priv'.

The user table holds each user's global privileges, which apply to all 
databases.

The db table determines database-level access privileges, which apply to all 
objects inside that database.

The host table restricts the host names from which the privileges are granted 
to the given user. 
[I am not sure if we need to have this table.]


== 3.1 user, group, and roles ==

A user can belong to one or more groups. The group information is provided by 
the authenticator.

Each user or group can have one or more roles. A role can be a member of 
another role, but role membership must not be circular.

So hive metadata needs to store:

1) roles

2) Hive user/group -> role mapping
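
The "nested but not circular" rule could be enforced at the moment a role is added as a member of another role. The following is a minimal sketch in Python, assuming role membership is available as a mapping from each role to the set of roles that are its direct members. The function name `would_create_cycle` and the role names are hypothetical and not part of Hive.

```python
def would_create_cycle(members, parent, child):
    """Return True if making `child` a member of `parent` creates a cycle.

    `members` maps a role name to the set of roles that are direct members of it.
    """
    # A cycle appears exactly when `parent` is already reachable from `child`
    # by following existing membership edges (or when parent == child).
    stack, seen = [child], set()
    while stack:
        role = stack.pop()
        if role == parent:
            return True
        if role in seen:
            continue
        seen.add(role)
        stack.extend(members.get(role, ()))
    return False


# Hypothetical data: role "analyst" already has role "reader" as a member.
members = {"analyst": {"reader"}, "reader": set()}
print(would_create_cycle(members, "reader", "analyst"))  # adding analyst under reader
print(would_create_cycle(members, "analyst", "writer"))  # adding a fresh role
```

A check like this would run before the role membership row is written, so the stored graph never contains a cycle.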

=== 3.1.1 Role management ===

create role

drop role

add a user to a role

remove a user from a role

=== 3.1.2 role metadata ===

role_name - string

create_time - int

=== 3.1.3 hive role user membership table ===

role_name - string

user_name - string

is_group -- is the user name a group name

is_role  -- is the user name a role name


== 3.2 Privileges to be supported by Hive ==

=== 3.2.1 metadata ===

The following shows how we store the grant information in the metastore. The 
deny information is stored in the same manner (just in different tables). 

So for each grant table, there will also be a deny table. The metastore tables 
are

user, deny_user, db, deny_db, tables_priv, deny_tables_priv, columns_priv, 
deny_columns_priv

Another way to do it is to add a column to each grant table that records 
whether the row is a grant or a deny.


We store all privileges of a row in one column and use commas to separate the 
different privileges.
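
As an illustration of the single-column encoding, here is a small sketch in Python that reads and writes such a comma-separated value. The helper names `parse_privs` and `format_privs` are hypothetical, and sorting the names on output is an assumption made only to keep the stored value stable.

```python
def parse_privs(column_value):
    """Split a comma-separated privilege column into a set of names."""
    # Empty items (for example from a doubled comma) are ignored.
    return {p.strip() for p in column_value.split(",") if p.strip()}


def format_privs(privs):
    """Join privilege names back into one column value, sorted for stability."""
    return ",".join(sorted(privs))


privs = parse_privs("Select_priv, Insert_priv,,Drop_priv")
privs.discard("Drop_priv")   # revoking one privilege is a set operation
print(format_privs(privs))
```

Working on a set makes grant and revoke idempotent: granting a privilege twice or revoking one that is absent leaves the column unchanged.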


hive> desc user;

Field 

- - - - 

User

isRole

isGroup

isSuper

db_priv -- set (Select_priv, Insert_priv, Create_priv, Drop_priv, Reload_priv, 

                Grant_priv, Index_priv, Alter_priv, Show_db_priv,

                Lock_tables_priv, Create_view_priv, Show_view_priv)

hive> desc db;  

Field    

- - - - 

Db

User

isRole

isGroup

Table_priv  -- set (Select_priv, Insert_priv, Create_priv, Drop_priv, Grant_priv, 

                    Index_priv, Reload_priv, Alter_priv, Create_tmp_table_priv, 

                    Lock_tables_priv, Create_view_priv, Show_view_priv)


hive> desc tables_priv;


Field

- - - - 

Db

User

isRole

isGroup

Table_name

Grantor

Timestamp

Table_priv  -- set('Select','Insert','Create','Drop','Grant','Index','Alter','Create View','Show view')

Column_priv -- set('Select','Insert')


hive> desc columns_priv;

Field

- - - -     

Db          

User        

isRole

isGroup

Table_name  

Column_name 

Timestamp   

Column_priv -- set('Select','Insert','Update')


= 4. grant/revoke access privilege =

== 4.1 Privilege names/types: ==

ALL Privileges

Alter

Create

Create temporary tables

Create view

Delete

Drop

Index

Insert

Lock Tables

Select 

Show databases

Show view

Super

Update


== 4.2 show grant ==

== 4.3 grant/revoke statement ==

GRANT
    priv_type [(column_list)]
      [, priv_type [(column_list)]] ...
    ON [object_type] priv_level
    TO user [, user] ...
WITH ADMIN OPTION

object_type:
    TABLE

priv_level:
    *
  | *.*
  | db_name.*
  | db_name.tbl_name
  | tbl_name

REVOKE
    priv_type [(column_list)]
      [, priv_type [(column_list)]] ...
    ON [object_type] priv_level
    FROM user [, user] ...

REVOKE ALL PRIVILEGES, GRANT OPTION
    FROM user [, user] ...


DENY
    priv_type [(column_list)]
      [, priv_type [(column_list)]] ...
    ON [object_type] priv_level
    FROM user [, user] ...
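
To make the `priv_level` forms above concrete, the sketch below maps each form to the privilege level it would address. It is written in Python and rests on two assumptions that the page does not state: a bare `*` means every table in the current database, and a bare `tbl_name` is resolved against the current database (this mirrors MySQL, whose GRANT grammar this one resembles). The function name `parse_priv_level` and the scope labels are hypothetical.

```python
def parse_priv_level(priv_level, current_db):
    """Map a priv_level string to a (scope, db, table) triple.

    Assumption: `*` and a bare table name are resolved against `current_db`.
    """
    if priv_level == "*.*":
        return ("global", None, None)          # every database
    if priv_level == "*":
        return ("db", current_db, None)        # every table in the current database
    if "." in priv_level:
        db, table = priv_level.split(".", 1)
        if table == "*":
            return ("db", db, None)            # db_name.*
        return ("table", db, table)            # db_name.tbl_name
    return ("table", current_db, priv_level)   # tbl_name


for level in ("*", "*.*", "db_name.*", "db_name.tbl_name", "tbl_name"):
    print(level, "->", parse_priv_level(level, "default"))
```

Under these assumptions the "global" scope would correspond to a row in the 'user' table, "db" to the 'db' table, and "table" to 'tables_priv'. A non-empty `column_list` in the statement would move the grant to 'columns_priv'.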

= 5. Authorization verification =

== 5.1 USER/GROUP/ROLE ==

USER

GROUP

ROLE

A GROUP is very similar to a role. We support groups because we may need to 
pass the group information to HDFS/MapReduce. A role, however, does not need 
to be a group.

Roles can be nested, but the nesting must not be circular.


[
In Oracle, a role groups several privileges and roles so that they can be 
granted to and revoked from users together. A role must be enabled for a 
user before the user can use it. Oracle also has role authorization: 
Create role/Drop role requires the CREATE ROLE system privilege.
]

== 5.2 The verification steps ==

When a user logs in to the system, he has a user name and one or more groups 
that he belongs to. He may also have been granted some roles.
So the identity to check is 

[

username, 

list of group names, 

list of roles that have been directly granted to the user, 

list of roles that have been directly granted to groups that the user belongs to

].

First, try the user name.

Start by trying to deny the access: look up the deny tables by user name.


1. If there is an entry in 'user' that denies this access, return DENY

2. If there is an entry in 'db' that denies this access, return DENY

3. If there is an entry in 'table' that denies this access, return DENY

4. If there is an entry in 'column' that denies this access, return DENY



If no deny entry matches, go through all privilege levels of the grant tables 
with the user name:


5. If there is an entry in 'user' that accepts this access, return ACCEPT

6. If there is an entry in 'db' that accepts this access, return ACCEPT

7. If there is an entry in 'table' that accepts this access, return ACCEPT

8. If there is an entry in 'column' that accepts this access, return ACCEPT



Second, try the user's group and role names one by one until we get an ACCEPT 
or DENY. If any group or role yields a DENY, the access is denied. 


For each role or group, we follow the same routine as for the user name.
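
The verification steps above can be sketched as follows. This is a minimal Python sketch, not the Hive implementation. It assumes the grant and deny tables are represented as dicts from level name to a set of `(principal, object_key, privilege)` entries. It reads "if we get one DENY from one group/role" as "any DENY among the groups and roles wins over any ACCEPT", and it assumes that an access with no matching entry at all is denied. All names are hypothetical.

```python
LEVELS = ("user", "db", "table", "column")


def scope_keys(db, table, column):
    """Object key under which an access to db.table.column is recorded per level."""
    return {"user": None, "db": db, "table": (db, table),
            "column": (db, table, column)}


def check_principal(name, priv, keys, grants, denies):
    """Steps 1-8 for one principal: all deny levels first, then all grant levels."""
    for rules, verdict in ((denies, "DENY"), (grants, "ACCEPT")):
        for level in LEVELS:
            if (name, keys[level], priv) in rules.get(level, set()):
                return verdict
    return None  # no matching entry at any level


def authorize(user, groups_and_roles, priv, db, table, column, grants, denies):
    keys = scope_keys(db, table, column)
    verdict = check_principal(user, priv, keys, grants, denies)
    if verdict:
        return verdict  # the user name itself decides first
    verdicts = [check_principal(p, priv, keys, grants, denies)
                for p in groups_and_roles]
    if "DENY" in verdicts:
        return "DENY"  # assumption: one DENY from any group/role wins
    # Assumption: with no ACCEPT anywhere, the access is denied by default.
    return "ACCEPT" if "ACCEPT" in verdicts else "DENY"


# Hypothetical data: 'users' may select from db_name.*, 'users2' is denied on T.
grants = {"db": {("users", "db_name", "Select")}}
denies = {"table": {("users2", ("db_name", "T"), "Select")}}
print(authorize("alice", ["users"], "Select", "db_name", "T", None, grants, denies))
print(authorize("bob", ["users", "users2"], "Select", "db_name", "T", None, grants, denies))
```

With this reading, a user who belongs to both a granted and a denied group is denied regardless of the order in which the groups are tried.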


== 5.3 Examples ==


5.3.1 I want to grant everyone (new people may join at any time) access to
db_name.*, and later I want to protect one table, db_name.T, from all
users except a few.


1) Add all users to a group 'users' (assumption: new users will
automatically join this group), and grant 'users' ALL privileges on db_name.*

2) Add those few users to a new group 'users2', and remove them from 'users'

3) DENY 'users' on db_name.T

4) Grant ALL on db_name.T to 'users2'


5.3.2 I want to protect one table, db_name.T, from one or a few users, while
all other people can access it.

1) Add all users to a group 'users' (assumption: new users will automatically
join this group), and grant 'users' ALL privileges on db_name.*.

2) Add those few users to a new group 'users2'. (Note: those few users will now
belong to two groups: 'users' and 'users2'.)

3) DENY 'users2' on db_name.T


= 6. Where to add authorization in Hive =

Authorization belongs in CliDriver and HiveServer, which basically share the 
same code. If HiveServer invokes CliDriver, we can just add it to CliDriver. We 
also need to make HiveServer able to support multiple users/connections.

= 7. Implementation =

== 7.1 Authenticator interface ==

We only get the user's user name and group names from the authenticator. 
Authenticator implementations need to provide this information. This is the 
only interface between the authenticator and authorization.
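
One possible shape for such an interface is sketched below in Python. The class and method names (`Authenticator`, `get_user_name`, `get_group_names`, `StaticAuthenticator`) are hypothetical and are not taken from the Hive code base. The point is only that authorization sees two pieces of data and nothing else.

```python
from abc import ABC, abstractmethod


class Authenticator(ABC):
    """Hypothetical interface: the only data authorization asks of authentication."""

    @abstractmethod
    def get_user_name(self):
        ...

    @abstractmethod
    def get_group_names(self):
        ...


class StaticAuthenticator(Authenticator):
    """Toy implementation backed by fixed values, for illustration only."""

    def __init__(self, user_name, group_names):
        self._user_name = user_name
        self._group_names = list(group_names)

    def get_user_name(self):
        return self._user_name

    def get_group_names(self):
        # Return a copy so callers cannot modify the authenticator's state.
        return list(self._group_names)


auth = StaticAuthenticator("alice", ["users"])
print(auth.get_user_name(), auth.get_group_names())
```

Roles are deliberately absent from this sketch, since the design stores role membership in the Hive metadata and not in the authenticator.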

== 7.2 Authorization ==

The authorization decision manager manages a set of authorization providers, 
and each provider can decide to accept or deny. The decision manager makes the 
final decision. It can be vote based, or deny on the first -1, or accept on the 
first +1. Each authorization provider decides whether to accept or deny an 
access based on its own information.
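
The three combination strategies mentioned above could look like the following Python sketch. It assumes each provider is a callable that returns +1 (accept), -1 (deny), or 0 (abstain) for a request, and that ties or a complete lack of accepting votes result in DENY. Both are assumptions of this sketch, as are the names `decide` and `DecisionManager`.

```python
def decide(votes, strategy):
    """Combine provider votes (+1 accept, -1 deny, 0 abstain) into a decision."""
    if strategy == "any_deny":      # one -1 then deny
        return "ACCEPT" if -1 not in votes and 1 in votes else "DENY"
    if strategy == "any_accept":    # one +1 then accept
        return "ACCEPT" if 1 in votes else "DENY"
    if strategy == "majority":      # vote based; ties are denied (assumption)
        return "ACCEPT" if sum(votes) > 0 else "DENY"
    raise ValueError("unknown strategy: " + strategy)


class DecisionManager:
    """Asks every provider for a vote and applies one combination strategy."""

    def __init__(self, providers, strategy):
        self.providers = providers  # callables: request -> +1, -1 or 0
        self.strategy = strategy

    def authorize(self, request):
        return decide([p(request) for p in self.providers], self.strategy)


# Hypothetical providers: two accept, one denies.
providers = [lambda r: 1, lambda r: -1, lambda r: 1]
for strategy in ("majority", "any_deny", "any_accept"):
    print(strategy, DecisionManager(providers, strategy).authorize("select T"))
```

Keeping the combination rule separate from the providers means an external check, such as one based on HDFS permissions, could be added as one more provider without changing the others.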

------------

= HDFS Permission =
The above makes a STRONG assumption about security at the file layer. Users 
can easily bypass this security if the HDFS file permissions are open to them. 
We hope to be able to plug in external authorizations (like HDFS permissions) 
easily to alter the authorization result or even the rule.
