Bolke de Bruin created SPARK-10181:
--------------------------------------
Summary: In a kerberized environment Hive is not used with keytab principal but with user principal
Key: SPARK-10181
URL: https://issues.apache.org/jira/browse/SPARK-10181
Project: Spark
Issue Type: Bug
Affects Versions: 1.5.0
Reporter: Bolke de Bruin
`bin/spark-submit --num-executors 1 --executor-cores 5 --executor-memory 5G
--driver-java-options -XX:MaxPermSize=4G --driver-class-path
lib/datanucleus-api-jdo-3.2.6.jar:lib/datanucleus-core-3.2.10.jar:lib/datanucleus-rdbms-3.2.9.jar:conf/hive-site.xml
--files conf/hive-site.xml --master yarn --principal sparkjob --keytab
/etc/security/keytabs/sparkjob.keytab --conf
spark.yarn.executor.memoryOverhead=18000 --conf
"spark.executor.extraJavaOptions=-XX:MaxPermSize=4G" --conf
spark.eventLog.enabled=false ~/test.py`
With:
#!/usr/bin/python
from pyspark import SparkContext
from pyspark.sql import HiveContext
sc = SparkContext()
sqlContext = HiveContext(sc)
query = """ SELECT * FROM fm.sk_cluster """
rdd = sqlContext.sql(query)
rdd.registerTempTable("test")
sqlContext.sql("CREATE TABLE wcs.test LOCATION '/tmp/test_gl' AS SELECT * FROM test")
Ends up with:
Caused by:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
Permission denied: user=ua80tl, access=READ_EXECUTE,
inode="/tmp/test_gl/.hive-staging_hive_2015-08-24_10-43-09_157_7805739002405787834-1/-ext-10000":sparkjob:hdfs:drwxr-x---
(Our umask denies read access to "other" by default, so the staging directory created as sparkjob cannot be read by the user principal ua80tl.)
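To make the failure easier to follow, here is a minimal sketch of the owner/group/other check that HDFS applies to the staging directory, using the values from the error message. The helper `can_access` and the group set `{"users"}` are hypothetical and only for illustration: the report does not list the groups of ua80tl, so the sketch assumes the user is not in group hdfs. Superuser bypass and ACLs are ignored.

```python
def can_access(perm, owner, group, user, user_groups, wanted):
    """Return True if `user` holds every permission in `wanted` (e.g. "r-x").

    Simplified owner/group/other model; hypothetical helper, not a Hadoop API.
    """
    # perm looks like "drwxr-x---": one type character, then three triplets.
    triplets = {"owner": perm[1:4], "group": perm[4:7], "other": perm[7:10]}
    if user == owner:
        granted = triplets["owner"]
    elif group in user_groups:
        granted = triplets["group"]
    else:
        granted = triplets["other"]
    # Every requested bit must be present in the applicable triplet.
    return all(w == "-" or w == g for w, g in zip(wanted, granted))


# Owner, group, and mode are copied from the error message.
staging_perm = "drwxr-x---"

# The user principal falls into the "other" class (assumed group membership).
print(can_access(staging_perm, "sparkjob", "hdfs", "ua80tl", {"users"}, "r-x"))

# The keytab principal owns the directory and passes the check.
print(can_access(staging_perm, "sparkjob", "hdfs", "sparkjob", {"hdfs"}, "r-x"))
```

Under these assumptions the first call prints False and the second prints True, which matches the symptom: the staging directory is written as sparkjob, while the later READ_EXECUTE request is made as ua80tl.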