It looks like it's picking up the wrong NameNode URI from the HADOOP_CONF_DIR, so make sure that configuration is correct. Also, for submitting a Spark job to a remote cluster, you might want to look at the spark.driver.host and spark.driver.port settings.
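For reference, a minimal sketch of how those two settings could be passed at submission time. The hostname, port number, and application file below are placeholder values, not taken from this thread; `spark.driver.host` must be an address the cluster nodes can reach back to:

```shell
# Sketch: tell the executors where to reach the locally-running driver.
# "driver.example.com", 51000, and myapp.py are hypothetical placeholders.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --conf spark.driver.host=driver.example.com \
  --conf spark.driver.port=51000 \
  myapp.py
```

The same properties can also go in spark-defaults.conf if they apply to every submission.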
Thanks
Best Regards

On Wed, Jul 22, 2015 at 8:56 PM, rok <rokros...@gmail.com> wrote:

> I am trying to run Spark applications with the driver running locally and
> interacting with a firewalled remote cluster via a SOCKS proxy.
>
> I have to modify the hadoop configuration on the *local machine* to try to
> make this work, adding
>
> <property>
>   <name>hadoop.rpc.socket.factory.class.default</name>
>   <value>org.apache.hadoop.net.SocksSocketFactory</value>
> </property>
> <property>
>   <name>hadoop.socks.server</name>
>   <value>localhost:9998</value>
> </property>
>
> and on the *remote cluster* side
>
> <property>
>   <name>hadoop.rpc.socket.factory.class.default</name>
>   <value>org.apache.hadoop.net.StandardSocketFactory</value>
>   <final>true</final>
> </property>
>
> With this setup, and running "ssh -D 9998 gateway.host" to start the proxy
> connection, MapReduce jobs started on the local machine execute fine on the
> remote cluster. However, trying to launch a Spark job fails, with the nodes
> of the cluster apparently unable to communicate with one another:
>
> java.io.IOException: Failed on local exception: java.net.SocketException:
> Connection refused; Host Details : local host is: "node3/10.211.55.103";
> destination host is: "node1":8030;
>
> Looking at the packets being sent to node1 from node3, it's clear that no
> requests are made on port 8030, hinting that the connection is somehow
> being proxied.
>
> Is it possible that the Spark job is not honoring the socket.factory
> settings on the *cluster* side for some reason?
>
> Note that Spark JIRA SPARK-5004
> <https://issues.apache.org/jira/browse/SPARK-5004> addresses a similar
> problem, though it looks like they are actually not the same issue (since
> in that case a standalone cluster is being used).
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/problems-running-Spark-on-a-firewalled-remote-YARN-cluster-via-SOCKS-proxy-tp23955.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
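One way to narrow down the "Connection refused" error quoted above is to check, from the failing node, whether the destination port accepts a direct TCP connection at all. A minimal stdlib sketch of such a probe; the hostname and port in the usage comment are only the example values from the error message in the thread:

```python
import socket

def can_connect(host, port, timeout=2.0):
    """Return True if a direct TCP connection to host:port succeeds.

    Run from a cluster node to check whether another node's YARN
    scheduler port is reachable without any proxy in the path.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (hostname/port taken from the quoted error message):
# can_connect("node1", 8030)
```

If this returns True from node3 while the Spark job still fails, the problem is more likely the socket-factory settings being picked up by the job than the firewall itself.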