Github user Tagar commented on the issue:
https://github.com/apache/spark/pull/16482
I see this problem in my setup too. The same code worked fine in Spark 1.5 and Spark 1.6, and breaks with the same symptoms as [SPARK-19038](https://issues.apache.org/jira/browse/SPARK-19038) from Spark 2.0 onwards.
This problem is not with ticket refresh: it happens on the very first query in a fresh (just created) Spark context. The ticket refresh code runs only when a ticket is about to expire, not a few seconds after the Spark context has started, I believe - so perhaps something changed in Spark 2.
> py4j.protocol.Py4JJavaError: An error occurred while calling o61.sql.
> : org.apache.spark.SparkException: Keytab file: svc_odiprd.keytab-a1b98b7c-79fa-45b0-a80d-11953879a810 specified in spark.yarn.keytab does not exist
> at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:113)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
> at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:354)
> at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:258)
> at org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
> at org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
> at org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
> at org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
> at org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:50)
> at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
> at org.apache.spark.sql.hive.HiveSessionState$$anon$1.<init>(HiveSessionState.scala:63)
> at org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
> at org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
> at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
> . . .
We also set the keytab and principal in Python code that uses yarn-client mode:
```python
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setMaster('yarn-client')
        .set("spark.yarn.keytab", kt_location)
        .set("spark.yarn.principal", kt_principal)
        )
sc = SparkContext(conf=conf)
```
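As a sanity check on our side (not a fix for the Spark 2 regression above), one can verify that the keytab path actually exists on the driver before handing it to `SparkConf`, so a bad local path fails fast with a clearer message than the Py4J traceback. This is just a sketch; `validated_keytab` is a hypothetical helper, not part of any Spark API:

```python
import os

def validated_keytab(path):
    """Return the keytab path if the file exists locally, otherwise
    raise early instead of failing later inside HiveClientImpl."""
    if not os.path.isfile(path):
        raise FileNotFoundError(
            "spark.yarn.keytab does not exist locally: %s" % path)
    return path
```

With this, the snippet above would use `.set("spark.yarn.keytab", validated_keytab(kt_location))`. Note this only catches a wrong local path; it would not catch the UUID-suffixed keytab name in the error above, which appears to come from Spark's own handling.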