Re: Strange behavior of spark-shell while accessing hdfs
Thanks, guys, for the info. So I have to use YARN to access a Kerberos-secured cluster.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Strange-behavior-of-spark-shell-while-accessing-hdfs-tp18549p18677.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
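As a rough sketch of what that looks like in practice (the keytab path, principal, and Hadoop config directory below are placeholders, not values from this thread), assuming a Spark 1.1-era client with a YARN-configured Hadoop installation:

```shell
# Authenticate to Kerberos first; principal and keytab are hypothetical
kinit -kt /etc/security/keytabs/myuser.keytab myuser@EXAMPLE.COM

# HADOOP_CONF_DIR must point at the cluster's Hadoop configuration so
# Spark can locate the Kerberos-secured NameNode and ResourceManager
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Launch spark-shell against YARN (client mode, Spark 1.1 syntax)
bin/spark-shell --master yarn-client
```

In YARN mode the client obtains HDFS delegation tokens at submission time and ships them to the executors, which is why this works where a standalone spark:// master does not.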
Re: Strange behavior of spark-shell while accessing hdfs
Only YARN mode is supported with Kerberos. You can't use a spark:// master with Kerberos.

Tobias Pfeiffer wrote:
> When you give a "spark://*" master, Spark will run on a different machine,
> where you have not yet authenticated to HDFS, I think. I don't know how to
> solve this, though, maybe some Kerberos token must be passed on to the
> Spark cluster?
Re: Strange behavior of spark-shell while accessing hdfs
You need to set the Spark configuration property spark.yarn.access.namenodes to your NameNode, e.g.:

    spark.yarn.access.namenodes=hdfs://mynamenode:8020

Similarly, I'm curious whether you're also running high-availability HDFS with an HA nameservice. I currently have HA HDFS and Kerberos, and I've noticed that I must set the above property to the currently active NameNode's hostname and port. Simply using the HA nameservice to get delegation tokens does NOT seem to work with Spark 1.1.0 (even though I can confirm the token is acquired). I believe this may be a bug. Unfortunately, simply adding both the active and standby NameNodes does not work either, as this actually causes an error. This means that when my active NameNode fails over, my Spark configuration becomes invalid.
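A minimal sketch of passing that property at launch time (the NameNode hostname and port are the same placeholders as above; this assumes a kinit has already been done):

```shell
# Tell Spark-on-YARN which NameNode(s) to fetch delegation tokens for.
# With HA + Spark 1.1.0, per the observation above, this must be the
# currently *active* NameNode, not the HA nameservice name.
bin/spark-shell --master yarn-client \
  --conf spark.yarn.access.namenodes=hdfs://mynamenode:8020
```

The property can equally go in conf/spark-defaults.conf; the --conf form just makes it easy to swap the hostname after a failover.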
Re: Strange behavior of spark-shell while accessing hdfs
Hi,

On Tue, Nov 11, 2014 at 2:04 PM, hmxxyy wrote:
> If I run bin/spark-shell without connecting a master, it can access a hdfs
> file on a remote cluster with kerberos authentication.
[...]
> However, if I start the master and slave on the same host and using
> bin/spark-shell --master spark://*.*.*.*:7077
> run the same commands [...]
> org.apache.hadoop.security.AccessControlException: Client cannot
> authenticate via:[TOKEN, KERBEROS]; Host Details : local host is:
> "*.*.*.*.com/98.138.236.95"; destination host is: "*.*.*.*":8020;

When you give no master, it is "local[*]", so Spark will (implicitly?) authenticate to HDFS from your local machine using local environment variables, key files etc., I guess. When you give a "spark://*" master, Spark will run on a different machine, where you have not yet authenticated to HDFS, I think. I don't know how to solve this, though; maybe some Kerberos token must be passed on to the Spark cluster?

Tobias
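The contrast described above can be sketched as follows (the master hostname is a placeholder; this assumes the local user has a valid Kerberos ticket from a prior kinit):

```shell
# Works: with no --master, spark-shell runs as local[*] in the current
# process, so HDFS access reuses the local user's Kerberos ticket cache
bin/spark-shell

# Fails: the standalone executors run under a different process/host with
# no Kerberos credentials, so the NameNode rejects them with
# "Client cannot authenticate via:[TOKEN, KERBEROS]"
bin/spark-shell --master spark://master-host:7077
```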