[
https://issues.apache.org/jira/browse/SPARK-9042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061453#comment-15061453
]
Vijay Singh commented on SPARK-9042:
------------------------------------
Hi Charmee,
You can invoke spark-shell or spark-submit in the following fashion to gain access
to HiveContext functionality. Here is an example for spark-shell:
{code}
HADOOP_CONF_DIR=/etc/hive/conf spark-shell --master yarn-client \
  --driver-class-path '/opt/cloudera/parcels/CDH/lib/hive/lib/*' \
  --driver-java-options \
  '-Dspark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hive/lib/*'
{code}
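For spark-submit, the same options should carry over. The following is only a sketch under assumptions: the application class and jar names are hypothetical placeholders, and the flags simply mirror the spark-shell example above. The script prints the assembled command so it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch only: assembles a spark-submit command that mirrors the
# spark-shell example. APP_CLASS and APP_JAR are hypothetical placeholders.
HIVE_CONF=/etc/hive/conf
HIVE_LIB='/opt/cloudera/parcels/CDH/lib/hive/lib/*'
APP_CLASS=com.example.HiveQueryApp   # hypothetical main class
APP_JAR=hive-query-app.jar           # hypothetical application jar

# Print the full command on one line; nothing is executed here.
build_submit_cmd() {
  printf '%s ' \
    "HADOOP_CONF_DIR=$HIVE_CONF" spark-submit \
    --master yarn-client \
    --class "$APP_CLASS" \
    --driver-class-path "'$HIVE_LIB'" \
    --driver-java-options "'-Dspark.executor.extraClassPath=$HIVE_LIB'" \
    "$APP_JAR"
  printf '\n'
}

build_submit_cmd
```

After checking the printed command, it can be pasted into a shell on a node where spark-submit is on the PATH.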
Additionally, if metastore access is restricted, the service/user account's
group can be granted access to the metastore in the following fashion.
# Go to Cloudera Manager > Hive > Configuration > Service-Wide > Proxy > Hive
Metastore Access Control and Proxy User Groups Override
# Add the group name for {color:red}all service accounts and users that
require Hive metastore access{color}, in addition to the hive and hue
users.
# Restart the Hive Metastore Server for the changes to take effect.
> Spark SQL incompatibility if security is enforced on the Hive warehouse
> -----------------------------------------------------------------------
>
> Key: SPARK-9042
> URL: https://issues.apache.org/jira/browse/SPARK-9042
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.2.0
> Reporter: Nitin Kak
>
> Hive queries executed from Spark using HiveContext use the CLI to create the
> query plan and then access the Hive table directories (under
> /user/hive/warehouse/) directly. This gives an AccessControlException if Apache
> Sentry is installed:
> org.apache.hadoop.security.AccessControlException: Permission denied:
> user=kakn, access=READ_EXECUTE,
> inode="/user/hive/warehouse/mastering.db/sample_table":hive:hive:drwxrwx--t
> With Apache Sentry, only the "hive" user (created only for Sentry) has
> permission to access the Hive warehouse directory. After Sentry is
> installed, all queries are directed to HiveServer2, which changes
> the invoking user to "hive" and then accesses the Hive warehouse
> directory. However, HiveContext does not execute the query through
> HiveServer2, which leads to this issue. Here is an example of executing
> a Hive query through HiveContext:
> val hqlContext = new HiveContext(sc) // Create context to run Hive queries
> val pairRDD = hqlContext.sql(hql) // where hql is the string with hive query