[
https://issues.apache.org/jira/browse/SPARK-9825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091718#comment-15091718
]
Rok Roskar commented on SPARK-9825:
-----------------------------------
I'm not sure who is responsible for honoring the "final" property flag: the
client or the cluster side? If the "final" designation is ignored in general,
it could cause problems well beyond this use case.
> Spark overwrites remote cluster "final" properties with local config
> ---------------------------------------------------------------------
>
> Key: SPARK-9825
> URL: https://issues.apache.org/jira/browse/SPARK-9825
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Reporter: Rok Roskar
>
> Configuration options specified in the Hadoop cluster *.xml config files can
> be marked as "final", indicating that they should not be overwritten by a
> client's configuration. Spark appears to overwrite those options anyway; the
> symptom is that local proxy settings replace the cluster-side proxy
> settings. This breaks jobs submitted to a remote, firewalled YARN cluster.
> For example, with the configuration below, one should be able to establish a
> SOCKS proxy via ssh -D to a host that can "see" the cluster, and then submit
> jobs and run the driver on the local desktop/laptop:
> Remote cluster-side core-site.xml:
> {code:xml}
> <property>
> <name>hadoop.rpc.socket.factory.class.default</name>
> <value>org.apache.hadoop.net.StandardSocketFactory</value>
> <final>true</final>
> </property>
> {code}
> This configuration ensures that the nodes within the cluster never use a
> proxy to talk to each other.
> Local client-side core-site.xml:
> {code:xml}
> <property>
> <name>hadoop.rpc.socket.factory.class.default</name>
> <value>org.apache.hadoop.net.SocksSocketFactory</value>
> </property>
> <property>
> <name>hadoop.socks.server</name>
> <value>localhost:9999</value>
> </property>
> {code}
> Indeed, running a standard MapReduce job, the log files show that an override
> of a property marked <final> is attempted:
> {code}
> 2015-07-27 15:26:11,706 WARN [main] org.apache.hadoop.conf.Configuration:
> job.xml:an attempt to override final parameter:
> hadoop.rpc.socket.factory.class.default; Ignoring.
> {code}
> and the MR job proceeds and finishes normally.
> On the other hand, a Spark job with the same configuration shows no such
> message and instead we see that the nodes within the cluster are not able to
> communicate:
> {code}
> 15/07/27 15:25:43 INFO client.RMProxy: Connecting to ResourceManager at
> node1/10.211.55.101:8030
> 15/07/27 15:25:43 INFO yarn.YarnRMClient: Registering the ApplicationMaster
> 15/07/27 15:25:44 INFO ipc.Client: Retrying connect to server:
> node1/10.211.55.101:8030. Already tried 0 time(s); retry policy is
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000
> MILLISECONDS)
> {code}
> Running tcpdump on the slave nodes shows that in the case of the MR job,
> packets are sent between slave nodes and the ResourceManager node indicating
> that no proxy is being used, while in the case of the Spark job no such
> connection is made.
> A further indication that the cluster-side configuration is altered is that
> if a dedicated proxy server is set up in a way that both sides can see it,
> i.e. the local core-site.xml is changed to have
> {code:xml}
> <property>
> <name>hadoop.socks.server</name>
> <value>node2:9999</value>
> </property>
> {code}
> both the Spark job and the MR job run fine, with all connections going
> through the dedicated proxy server. While this works, it is sub-optimal:
> such a server has to be set up first, which may not always be possible
> because it requires privileged access to the gateway machine.
> Therefore, it appears that Spark is perfectly happy running through a proxy
> in YARN mode, but that it overrides the cluster-side configuration even when
> properties are marked as {{<final>}}. Is this intended? Or is there some
> other way to enforce that "final" properties are preserved?
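
To make the expected behaviour concrete, here is a minimal sketch of the merge semantics described above. It is a toy model in Python, not Hadoop's or Spark's actual code: the function names (parse_properties, merge) and the honor_final switch are hypothetical, and it assumes the cluster-side file is loaded before the client-side one. With honor_final=True it mimics what the MR log suggests (the override is ignored with a warning); with honor_final=False it mimics the behaviour the Spark job appears to show.

```python
import xml.etree.ElementTree as ET

# The two core-site.xml fragments from the report, wrapped in a root element.
CLUSTER_XML = """
<configuration>
  <property>
    <name>hadoop.rpc.socket.factory.class.default</name>
    <value>org.apache.hadoop.net.StandardSocketFactory</value>
    <final>true</final>
  </property>
</configuration>
"""

CLIENT_XML = """
<configuration>
  <property>
    <name>hadoop.rpc.socket.factory.class.default</name>
    <value>org.apache.hadoop.net.SocksSocketFactory</value>
  </property>
  <property>
    <name>hadoop.socks.server</name>
    <value>localhost:9999</value>
  </property>
</configuration>
"""


def parse_properties(xml_text):
    """Return (name, value, is_final) tuples from a *-site.xml string."""
    props = []
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        final_text = prop.findtext("final", default="false").strip().lower()
        props.append((name, value, final_text == "true"))
    return props


def merge(resources, honor_final=True):
    """Merge resources in order; later resources override earlier ones.

    Hypothetical model: if honor_final is True, a property that an earlier
    resource marked final keeps its value and a warning is recorded.
    """
    merged, finals, warnings = {}, set(), []
    for xml_text in resources:
        for name, value, is_final in parse_properties(xml_text):
            if honor_final and name in finals:
                warnings.append(
                    "an attempt to override final parameter: %s; Ignoring." % name
                )
                continue
            merged[name] = value
            if is_final:
                finals.add(name)
    return merged, warnings


if __name__ == "__main__":
    key = "hadoop.rpc.socket.factory.class.default"
    honored, warnings = merge([CLUSTER_XML, CLIENT_XML], honor_final=True)
    ignored, _ = merge([CLUSTER_XML, CLIENT_XML], honor_final=False)
    print("final honored:", honored[key], warnings)
    print("final ignored:", ignored[key])
```

In this model the load order matters: the final flag only protects a value against resources merged after the one that declared it, which is why the question of which side applies the merge is relevant.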