Replying to document my fix: I was able to trick Spark into working by setting my hostname to my preferred IP address,
i.e.

    $ sudo hostname 192.168.250.47

Not sure if this is a good idea in general, but it worked well enough for me to develop with my MacBook driving the cluster through the VPN.

On Sat, Oct 5, 2013 at 7:15 PM, Aaron Babcock <[email protected]> wrote:
> Hmm, that did not seem to do it.
>
> Interestingly, the problem only appears with
>     rdd.take(1)
>
> rdd.collect() works just fine.
>
> On Sat, Oct 5, 2013 at 4:49 PM, Aaron Davidson <[email protected]> wrote:
>> You might try also setting spark.driver.host to the correct IP in the
>> conf/spark-env.sh SPARK_JAVA_OPTS as well.
>>
>> e.g.,
>>     -Dspark.driver.host=192.168.250.47
>>
>> On Sat, Oct 5, 2013 at 2:45 PM, Aaron Babcock <[email protected]> wrote:
>>>
>>> Hello,
>>>
>>> I am using Spark through a VPN. My driver machine ends up with two IP
>>> addresses, one routable from the cluster and one not.
>>>
>>> Things generally work when I set the SPARK_LOCAL_IP environment
>>> variable to the proper IP address.
>>>
>>> However, when I try to use the take function, i.e. myRdd.take(1), I run
>>> into a hiccup. From the log files on the workers I can see that they
>>> are trying to connect to the non-routable IP address; they are not
>>> respecting SPARK_LOCAL_IP somehow.
>>>
>>> Here is the relevant worker log snippet. 192.168.250.47 is the correct
>>> routable IP address of the driver; 192.168.0.7 is the incorrect
>>> address of the driver. Any thoughts about what else I need to
>>> configure?
>>>
>>> 13/10/05 16:17:36 INFO ConnectionManager: Accepted connection from
>>> [192.168.250.47/192.168.250.47]
>>> 13/10/05 16:18:41 WARN SendingConnection: Error finishing connection
>>> to /192.168.0.7:60513
>>> java.net.ConnectException: Connection timed out
>>>   at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
>>>   at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
>>>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>>>   at spark.network.SendingConnection.finishConnect(Connection.scala:221)
>>>   at spark.network.ConnectionManager.spark$network$ConnectionManager$$run(ConnectionManager.scala:127)
>>>   at spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:70)
>>> 13/10/05 16:18:41 INFO ConnectionManager: Handling connection error on
>>> connection to ConnectionManagerId(192.168.0.7,60513)
>>> 13/10/05 16:18:41 INFO ConnectionManager: Removing SendingConnection
>>> to ConnectionManagerId(192.168.0.7,60513)
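[Editor's note] Pulling the suggestions from this thread together, the driver-side settings would look roughly like the sketch below. This is only an illustration of the approach discussed above, not a verified fix; the exact variables honored can differ between Spark versions, and the IP address is the one specific to this thread.

```shell
# conf/spark-env.sh on the driver machine (sketch, assuming Spark 0.8-era config)

# Bind the driver's sockets to the VPN-routable address
# instead of whichever interface is picked by default.
export SPARK_LOCAL_IP=192.168.250.47

# Also advertise that address to the executors, so operations like
# rdd.take(1) that connect back to the driver use the routable IP.
export SPARK_JAVA_OPTS="$SPARK_JAVA_OPTS -Dspark.driver.host=192.168.250.47"
```

The `sudo hostname 192.168.250.47` trick from the top of the thread works around the same problem at the OS level, but changing the machine's hostname affects every process, so the env-var route is the narrower change.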
