[ https://issues.apache.org/jira/browse/SPARK-12482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kyle Sutton updated SPARK-12482:
--------------------------------
    Affects Version/s: 1.5.2

> CLONE - Spark fileserver not started on same IP as using spark.driver.host
> --------------------------------------------------------------------------
>
>                 Key: SPARK-12482
>                 URL: https://issues.apache.org/jira/browse/SPARK-12482
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.2.1, 1.5.2
>            Reporter: Kyle Sutton
>
> The issue of the file server using the default IP instead of the IP address
> configured through {{spark.driver.host}} still exists in _Spark 1.5.2_.
> The problem is that, while the file server is listening on all IP addresses
> of the file server host, the Spark service attempts to call back to the
> default IP of the host, to which it may or may not have connectivity.
> For instance, the following setup causes a
> {{java.net.SocketTimeoutException}} when the _Spark_ service tries to contact
> the _Spark_ driver host for a JAR:
> * Launch host has a default IP of {{192.168.1.2}} and a secondary LAN
> connection IP of {{172.30.0.2}}
> * A _Spark_ connection is made from the launch host to a service running on
> that LAN, {{172.30.0.0/16}}
> ** {{spark.driver.host}} is set to the IP of that _Spark_ service host
> {{172.30.0.3}}
> ** {{spark.driver.port}} is set to {{50003}}
> ** {{spark.fileserver.port}} is set to {{50005}}
> * Locally (on the launch host), the following listeners are active:
> ** {{0.0.0.0:50005}}
> ** {{172.30.0.2:50003}}
> * The _Spark_ service calls back to the file server host for a JAR file using
> the file server host's default IP: {{http://192.168.1.2:50005/jars/code.jar}}
> * The _Spark_ service, being on a different network than the launch host,
> cannot see the {{192.168.1.0/24}} address space, and fails to connect to the
> file server
> ** A {{netstat}} on the _Spark_ service host will show the connection to the
> file server host as being in {{SYN_SENT}} state until the process gives up
> trying to connect
> Given that the configured {{spark.driver.host}} must necessarily be
> accessible by the _Spark_ service, it makes sense to reuse it for file server
> connections and any other _Spark_ service -> driver host connections.
> ----
> Original report follows:
> I initially inquired about this here:
> http://mail-archives.apache.org/mod_mbox/spark-user/201503.mbox/%3ccalq9kxcn2mwfnd4r4k0q+qh1ypwn3p8rgud1v6yrx9_05lv...@mail.gmail.com%3E
> If the Spark driver host has multiple IPs and spark.driver.host is set to one
> of them, I would expect the fileserver to start on the same IP. I checked
> HttpServer and the jetty Server is started on the default IP of the machine:
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/HttpServer.scala#L75
> Something like this might work instead:
> {code:title=HttpServer.scala#L75}
> val server = new Server(new InetSocketAddress(conf.get("spark.driver.host"), 0))
> {code}
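
The idea in the report (bind the listener to the configured host and advertise that same address) can be illustrated with a minimal sketch. This is not Spark's code: it uses a plain {{java.net.ServerSocket}} in place of the Jetty {{Server}}, and the object {{BindToDriverHost}} and its methods {{bind}} and {{advertisedUri}} are hypothetical names introduced only for illustration. The {{driverHost}} argument stands in for the value of {{spark.driver.host}}.

```scala
import java.net.{InetSocketAddress, ServerSocket}

// Illustrative sketch only; none of these names exist in Spark.
object BindToDriverHost {

  // Bind a listener to one specific local address instead of the wildcard
  // address (0.0.0.0). `driverHost` stands in for spark.driver.host.
  // Port 0 asks the operating system to pick a free port.
  def bind(driverHost: String, port: Int = 0): ServerSocket = {
    val socket = new ServerSocket()
    socket.bind(new InetSocketAddress(driverHost, port))
    socket
  }

  // Build the URL handed to remote callers from the address the socket is
  // actually bound to, rather than from the machine's default IP.
  def advertisedUri(socket: ServerSocket, path: String): String = {
    val host = socket.getInetAddress.getHostAddress
    s"http://$host:${socket.getLocalPort}$path"
  }

  def main(args: Array[String]): Unit = {
    // Loopback is used here only so the sketch runs on any machine.
    val socket = bind("127.0.0.1")
    try println(advertisedUri(socket, "/jars/code.jar"))
    finally socket.close()
  }
}
```

Because the advertised URL is derived from the bound socket, the address a remote caller is told to use always matches an address the listener accepts connections on, which is the property the report asks for.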