I spun up a Spark standalone cluster (spark.authenticate=false) and submitted a
job which reads from a remote kerberized HDFS:
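import org.apache.hadoop.security.UserGroupInformation
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("spark://spark-standalone:7077")
  .getOrCreate()
// principal and keytab are defined elsewhere in the job
UserGroupInformation.loginUserFromKeytab(principal, keytab)
val df = spark.read.parquet("hdfs://namenode:8020/test/parquet/")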
I ran into the following exception:
Caused by:
java.io.IOException: java.io.IOException: Failed on local exception:
java.io.IOException: org.apache.hadoop.security.AccessControlException:
Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host
is: "..."; destination host is: "...":10346;
Any suggestions?
Thanks
Sudhir