[
https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Saisai Shao closed SPARK-21819.
-------------------------------
Resolution: Not A Problem
> UserGroupInformation initialization in SparkHadoopUtil will overwrite user config
> ---------------------------------------------------------------------------------
>
> Key: SPARK-21819
> URL: https://issues.apache.org/jira/browse/SPARK-21819
> Project: Spark
> Issue Type: Bug
> Components: Java API, YARN
> Affects Versions: 2.1.0, 2.1.1
> Environment: Ubuntu 14.04
> Spark 2.1.0/2.1.1 (I checked the GitHub source of 2.2.0; it exists there as well)
> Cluster mode: YARN client
> Reporter: Keith Sun
> Attachments: yarnsparkutil.jpg
>
>
> When submitting a job from Java or Scala code, the initialization of
> SparkHadoopUtil triggers a configuration overwrite in UGI, which may not be
> expected if the UGI has already been initialized from customized XMLs that
> are not on the classpath (e.g. via cfg4j, which can load configuration from
> GitHub, a database, etc.).
> {code:java}
> // SparkHadoopUtil's constructor overwrites the UGI configuration that user
> // code has already initialized.
> class SparkHadoopUtil extends Logging {
>   private val sparkConf = new SparkConf(false).loadFromSystemProperties(true)
>   val conf: Configuration = newConfiguration(sparkConf)
>   UserGroupInformation.setConfiguration(conf)
> {code}
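> The pattern behind the overwrite can be illustrated without Spark or Hadoop at all: UGI keeps its Configuration in a single process-wide static field, so whichever caller invokes setConfiguration last wins. A minimal stand-in (hypothetical FakeUgi/OverwriteDemo classes, not the real Hadoop API) sketches the effect:
> {code:java}
> import java.util.HashMap;
> import java.util.Map;
>
> // Hypothetical stand-in for UserGroupInformation: one static, process-wide
> // configuration that any caller can replace.
> class FakeUgi {
>     private static Map<String, String> conf = new HashMap<>();
>
>     static void setConfiguration(Map<String, String> c) {
>         conf = c; // clobbers whatever was set before, for the whole JVM
>     }
>
>     static String get(String key) {
>         // Like Hadoop, fall back to "simple" authentication by default.
>         return conf.getOrDefault(key, "simple");
>     }
> }
>
> public class OverwriteDemo {
>     public static void main(String[] args) {
>         // 1. User code installs a Kerberos configuration first.
>         Map<String, String> userConf = new HashMap<>();
>         userConf.put("hadoop.security.authentication", "kerberos");
>         FakeUgi.setConfiguration(userConf);
>
>         // 2. SparkHadoopUtil later builds a Configuration from the classpath
>         //    (which lacks the custom XMLs) and installs it again.
>         FakeUgi.setConfiguration(new HashMap<>());
>
>         // 3. The Kerberos setting is gone; authentication is back to "simple".
>         System.out.println(FakeUgi.get("hadoop.security.authentication"));
>     }
> }
> {code}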
> My scenario: my YARN cluster is Kerberized, and my configuration sets Hadoop
> security to Kerberos. However, after the initialization of SparkHadoopUtil,
> the authenticationMethod in UGI is reset to "simple" (my XMLs are not on the
> classpath), which leads to a failure like the one below:
> {code:java}
> 933453 [main] INFO org.apache.spark.SparkContext - Successfully stopped SparkContext
> Exception in thread "main" org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> 	at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
> 	at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
> 	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:209)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> 	at com.sun.proxy.$Proxy16.getClusterMetrics(Unknown Source)
> 	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:501)
> 	at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:154)
> 	at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:154)
> 	at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 	at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:60)
> 	at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:153)
> 	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
> 	at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:149)
> 	at org.apache.spark.SparkContext.<init>(SparkContext.scala:497)
> 	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2258)
> 	at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:831)
> 	at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:823)
> 	at scala.Option.getOrElse(Option.scala:121)
> 	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:823)
> 	at SparkTest.SparkEAZDebug.main(SparkEAZDebug.java:84)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1426)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1363)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
> 	at com.sun.proxy.$Proxy15.getClusterMetrics(Unknown Source)
> 	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:206)
> 	... 22 more
> {code}
> Sample code:
> {code:java}
> // Build a Hadoop Configuration from custom XMLs that are not on the classpath.
> Configuration hc = new Configuration(false);
> String yarnxml = String.format("%s/%s", ConfigLocation, "yarn-site.xml");
> String corexml = String.format("%s/%s", ConfigLocation, "core-site.xml");
> String hdfsxml = String.format("%s/%s", ConfigLocation, "hdfs-site.xml");
> String hivexml = String.format("%s/%s", ConfigLocation, "hive-site.xml");
> hc.addResource(yarnxml);
> hc.addResource(corexml);
> hc.addResource(hdfsxml);
> hc.addResource(hivexml);
>
> // Copy the Hadoop settings into SparkConf.
> SparkConf sc = new SparkConf(true);
> hc.forEach(entry -> {
>     if (entry.getKey().startsWith("hive")) {
>         sc.set(entry.getKey(), entry.getValue());
>     } else {
>         sc.set("spark.hadoop." + entry.getKey(), entry.getValue());
>     }
> });
>
> // Initialize UGI with the custom configuration and log in via keytab.
> UserGroupInformation.setConfiguration(hc);
> UserGroupInformation.loginUserFromKeytab(Principal, Keytab);
>
> // Creating the SparkSession triggers SparkHadoopUtil, which overwrites
> // the UGI configuration installed above.
> SparkSession sparkSession = SparkSession
>         .builder()
>         .master("yarn-client") // "yarn-client", "local"
>         .config(sc)
>         .appName(SparkEAZDebug.class.getName())
>         .enableHiveSupport()
>         .getOrCreate();
> {code}
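> For reference, one possible workaround (my assumption, not part of the original report or the resolution) is to route the security settings through spark.hadoop.*-prefixed properties, which SparkHadoopUtil.newConfiguration() copies into the Configuration it installs in UGI, e.g. via spark-defaults.conf:
> {code}
> # Sketch of a spark-defaults.conf fragment (assumed workaround): properties
> # with the spark.hadoop. prefix are copied into the Hadoop Configuration
> # that SparkHadoopUtil builds and hands to UserGroupInformation.
> spark.hadoop.hadoop.security.authentication  kerberos
> spark.hadoop.hadoop.security.authorization   true
> {code}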
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]