[
https://issues.apache.org/jira/browse/SPARK-9042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060208#comment-15060208
]
Andrew Ray commented on SPARK-9042:
-----------------------------------
Sean, I think there are a couple of issues going on here. In my experience with
the Sentry HDFS plugin, you can read tables just fine from Spark (which was the
stated issue here). However, there are other, similar issues that are real: you
can't create or modify any tables. There are two causes. The first is HDFS
permissions: the Sentry HDFS plugin only gives you read access. The second is
Hive metastore permissions: even if you create the table in some other HDFS
location that you have write access to, the operation still fails, because you
can't modify the Hive metastore. It has a whitelist of users that is by
default set to just hive and impala.
> Spark SQL incompatibility if security is enforced on the Hive warehouse
> -----------------------------------------------------------------------
>
> Key: SPARK-9042
> URL: https://issues.apache.org/jira/browse/SPARK-9042
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.2.0
> Reporter: Nitin Kak
>
> Hive queries executed from Spark using HiveContext use CLI to create the
> query plan and then access the Hive table directories (under
> /user/hive/warehouse/) directly. This gives an AccessControlException if
> Apache Sentry is installed:
> org.apache.hadoop.security.AccessControlException: Permission denied:
> user=kakn, access=READ_EXECUTE,
> inode="/user/hive/warehouse/mastering.db/sample_table":hive:hive:drwxrwx--t
> With Apache Sentry, only the "hive" user (created only for Sentry) has
> permission to access the hive warehouse directory. After Sentry is
> installed, all queries are directed to HiveServer2, which changes the
> invoking user to "hive" and then accesses the hive warehouse directory.
> However, HiveContext does not execute the query through HiveServer2, which
> leads to this issue. Here is an example of executing a Hive query through
> HiveContext:
> val hqlContext = new HiveContext(sc) // Create context to run Hive queries
> val pairRDD = hqlContext.sql(hql) // where hql is the string with hive query
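
For readers unfamiliar with the permission string in the quoted exception, the sketch below walks through a simplified owner/group/other check for the inode "hive:hive:drwxrwx--t". PermissionSketch, grants, and allowed are hypothetical names, not Hadoop or Spark APIs. The sketch assumes the user kakn is not a member of the hive group, and it ignores superusers, HDFS ACLs, and whatever the Sentry HDFS plugin adds on top.

```scala
object PermissionSketch {
  // Hypothetical helper (not a Hadoop or Spark API): does an rwx triplet such
  // as "rwx" or "--t" grant all of the requested actions?
  def grants(triplet: String, read: Boolean, write: Boolean, execute: Boolean): Boolean = {
    val canRead = triplet.charAt(0) == 'r'
    val canWrite = triplet.charAt(1) == 'w'
    // A lowercase 't' (sticky bit) or 's' in the last slot also implies execute.
    val canExecute = "xst".indexOf(triplet.charAt(2)) >= 0
    (!read || canRead) && (!write || canWrite) && (!execute || canExecute)
  }

  // Simplified owner/group/other check: pick the triplet for the first class
  // the user falls into, then test the requested actions against it.
  def allowed(mode: String, owner: String, group: String,
              user: String, userGroups: Set[String],
              read: Boolean, write: Boolean, execute: Boolean): Boolean = {
    val bits = mode.drop(1) // drop the leading file-type character ('d' or '-')
    val triplet =
      if (user == owner) bits.substring(0, 3)
      else if (userGroups.contains(group)) bits.substring(3, 6)
      else bits.substring(6, 9)
    grants(triplet, read, write, execute)
  }
}
```

Under those assumptions, PermissionSketch.allowed("drwxrwx--t", "hive", "hive", "kakn", Set("kakn"), read = true, write = false, execute = true) evaluates to false: kakn falls into the "other" class, whose triplet "--t" grants execute but not read, so READ_EXECUTE is denied. The same call with user "hive" evaluates to true.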