[
https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923733#action_12923733
]
Carl Steinbach commented on HIVE-78:
------------------------------------
The issue that Todd raised is pretty important and needs to be addressed in the
proposal.
My personal opinion is that running all queries as a "hive" super-user is the
most practical approach and will also yield behavior that is familiar to users
of traditional RDBMS systems (who I expect will increasingly define the average
Hive user/administrator).
There are some other follow-on issues that need to be decided if we end up
settling on this approach:
* This approach to authorization presupposes that users are accessing Hive
through a HiveServer process. This follows from the fact that A) you want Hive
to execute the query plans as the Hive superuser, and B) a user can
circumvent the authorization model if they are given direct access to the
MetaStore DB. It would be nice if the proposal explicitly stated this
requirement and mentioned some of the follow-on work that this necessitates,
e.g. fixing concurrency issues in HiveServer, reducing the memory requirements
of HiveServer, etc.
* We need to apply the authorization model to the {{add [archive|file|jar]}}
commands as well as {{add temporary function}}. {{add jar}} and {{add file}}
both currently allow the user to inject code into MR jobs, and {{add jar}} in
conjunction with {{add temporary function}} allows the user to inject and
execute arbitrary code within the HiveServer process. We may also want to add a
new {{add executable}} command for adding executable scripts that has a
different permission model than {{add file}}.
* I think there also may be security issues stemming from external tables, e.g.
if I create an external table that points to another user's home directory and
then run a query on it which executes with Hive's superuser permissions.
* Loading data into the Hive warehouse from an arbitrary HDFS location and
exporting data to other locations in HDFS are two issues that need to be
considered. In each case I think the correct behavior depends on both the Hive
process's permissions and those of the user.
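The last two points (external tables, and loading or exporting data) both come
down to checking two identities instead of one. The following is only a rough
sketch of that idea in Python, not part of the proposal: {{PathAcl}} and
{{can_load}} are hypothetical names, and the reader/writer sets are a
simplified stand-in for real HDFS permissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathAcl:
    """Hypothetical, simplified stand-in for the permissions on one HDFS path."""
    readers: frozenset
    writers: frozenset


def can_load(user: str, hive_user: str, source: PathAcl, warehouse: PathAcl) -> bool:
    """Decide whether `user` may load data from `source` into the warehouse.

    Assumption: the query itself runs as `hive_user` (the Hive superuser).
    """
    # The requesting user must be able to read the source on their own.
    # Otherwise running as the superuser would expose another user's data.
    user_ok = user in source.readers
    # The Hive process performs the actual I/O, so it needs read access to
    # the source and write access to the warehouse directory.
    hive_ok = hive_user in source.readers and hive_user in warehouse.writers
    return user_ok and hive_ok
```

Under these assumptions, a user who cannot read a directory is refused even
though the "hive" user could read it, which is the external-table case
described above. An export check would mirror this with the user's write
permission on the target location.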
> Authorization infrastructure for Hive
> -------------------------------------
>
> Key: HIVE-78
> URL: https://issues.apache.org/jira/browse/HIVE-78
> Project: Hive
> Issue Type: New Feature
> Components: Metastore, Query Processor, Server Infrastructure
> Reporter: Ashish Thusoo
> Assignee: He Yongqiang
> Attachments: createuser-v1.patch, hive-78-metadata-v1.patch,
> hive-78-syntax-v1.patch, hive-78.diff
>
>
> Allow Hive to integrate with existing user repositories for authentication
> and authorization information.