It looks like it's picking up the wrong NameNode URI from the
HADOOP_CONF_DIR, so make sure that is correct. Also, for submitting a Spark
job to a remote cluster, you might want to look at spark.driver.host and
spark.driver.port.
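For example, a minimal sketch of pinning those properties at submit time —
the hostname, port, class, and jar here are placeholders, not values from
your setup:

```shell
# Sketch: advertise a driver address/port that the cluster nodes can reach
# back to. Replace the host/port/class/jar with values valid for your network.
spark-submit \
  --master yarn \
  --conf spark.driver.host=my-local-host.example.com \
  --conf spark.driver.port=40000 \
  --class org.example.MyApp \
  my-app.jar
```

With a firewall in between, fixing spark.driver.port (instead of letting it
be chosen randomly) also makes it possible to open just that one port.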

Thanks
Best Regards

On Wed, Jul 22, 2015 at 8:56 PM, rok <rokros...@gmail.com> wrote:

> I am trying to run Spark applications with the driver running locally and
> interacting with a firewalled remote cluster via a SOCKS proxy.
>
> I have to modify the hadoop configuration on the *local machine* to try to
> make this work, adding
>
> <property>
>    <name>hadoop.rpc.socket.factory.class.default</name>
>    <value>org.apache.hadoop.net.SocksSocketFactory</value>
> </property>
> <property>
>    <name>hadoop.socks.server</name>
>    <value>localhost:9998</value>
> </property>
>
> and on the *remote cluster* side
>
> <property>
>     <name>hadoop.rpc.socket.factory.class.default</name>
>     <value>org.apache.hadoop.net.StandardSocketFactory</value>
>     <final>true</final>
> </property>
>
> With this setup, and running "ssh -D 9998 gateway.host" to start the proxy
> connection, MapReduce jobs started on the local machine execute fine on the
> remote cluster. However, trying to launch a Spark job fails with the nodes
> of the cluster apparently unable to communicate with one another:
>
> java.io.IOException: Failed on local exception: java.net.SocketException:
> Connection refused; Host Details : local host is: "node3/10.211.55.103";
> destination host is: "node1":8030;
>
> Looking at the packets sent from node3 to node1, it's clear that no
> requests are made on port 8030, hinting that the connection is somehow
> being proxied.
>
> Is it possible that the Spark job is not honoring the socket.factory
> settings on the *cluster* side for some reason?
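One quick way to check is to query the effective configuration on one of the
cluster nodes (assuming the hdfs CLI is available there):

```shell
# Sketch: print the socket-factory key as the cluster node resolves it.
# If this prints org.apache.hadoop.net.SocksSocketFactory instead of
# StandardSocketFactory, the <final> cluster-side override is not being
# picked up by the processes running there.
hdfs getconf -confKey hadoop.rpc.socket.factory.class.default
```
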
>
> Note that SPARK-5004
> <https://issues.apache.org/jira/browse/SPARK-5004> addresses a similar
> problem, though it looks like they are actually not the same issue (since
> in that case it sounds like a standalone cluster is being used).
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/problems-running-Spark-on-a-firewalled-remote-YARN-cluster-via-SOCKS-proxy-tp23955.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
