RE: Worker failed to connect when build with SPARK_HADOOP_VERSION=2.2.0
What version of the code are you using? 2.2.0 support is not yet merged into trunk. Check out https://github.com/apache/incubator-spark/pull/199

Best Regards,
Raymond Liu

From: horia@gmail.com [mailto:horia@gmail.com] On Behalf Of Horia
Sent: Monday, December 02, 2013 3:00 PM
To: user@spark.incubator.apache.org
Subject: Re: Worker failed to connect when build with SPARK_HADOOP_VERSION=2.2.0

Has this been resolved? Forgive me if I missed the follow-up, but I've been having the exact same problem.

- Horia

On Fri, Nov 22, 2013 at 5:38 AM, Maxime Lemaire <digital@gmail.com> wrote:

Hi all,
When I build Spark with Hadoop 2.2.0 support, workers can't connect to the Spark master anymore. The network is up and the hostnames are correct; tcpdump can clearly see the workers trying to connect (tcpdump outputs at the end). The same setup with a Spark build without SPARK_HADOOP_VERSION (or with SPARK_HADOOP_VERSION=2.0.5-alpha) works fine!

Some details:
pmtx-master01 : master
pmtx-master02 : slave (behavior is the same if I launch both master and slave from the same box)

Building HADOOP 2.2.0 support:

fresh install on pmtx-master01 :
# SPARK_HADOOP_VERSION=2.2.0 sbt/sbt assembly
...build successful...

fresh install on pmtx-master02 :
# SPARK_HADOOP_VERSION=2.2.0 sbt/sbt assembly
...build successful...

On pmtx-master01 :
# ./bin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /cluster/bin/spark-0.8.0-incubating/bin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-pmtx-master01.out
# netstat -an | grep 7077
tcp6       0      0 10.90.XX.XX:7077        :::*        LISTEN

On pmtx-master02 :
# nc -v pmtx-master01 7077
pmtx-master01 [10.90.XX.XX] 7077 (?)
open

# ./spark-class org.apache.spark.deploy.worker.Worker spark://pmtx-master01:7077
13/11/22 10:57:50 INFO Slf4jEventHandler: Slf4jEventHandler started
13/11/22 10:57:50 INFO Worker: Starting Spark worker pmtx-master02:42271 with 8 cores, 22.6 GB RAM
13/11/22 10:57:50 INFO Worker: Spark home: /cluster/bin/spark
13/11/22 10:57:50 INFO WorkerWebUI: Started Worker web UI at http://pmtx-master02:8081
13/11/22 10:57:50 INFO Worker: Connecting to master spark://pmtx-master01:7077
13/11/22 10:57:50 ERROR Worker: Connection to master failed! Shutting down.

With spark-shell on pmtx-master02 :
# MASTER=spark://pmtx-master01:7077 ./spark-shell
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 0.8.0
      /_/

Using Scala version 2.9.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_31)
Initializing interpreter...
Creating SparkContext...
13/11/22 11:19:29 INFO Slf4jEventHandler: Slf4jEventHandler started
13/11/22 11:19:29 INFO SparkEnv: Registering BlockManagerMaster
13/11/22 11:19:29 INFO MemoryStore: MemoryStore started with capacity 323.9 MB.
13/11/22 11:19:29 INFO DiskStore: Created local directory at /tmp/spark-local-20131122111929-3e3c
13/11/22 11:19:29 INFO ConnectionManager: Bound socket to port 42249 with id = ConnectionManagerId(pmtx-master02,42249)
13/11/22 11:19:29 INFO BlockManagerMaster: Trying to register BlockManager
13/11/22 11:19:29 INFO BlockManagerMaster: Registered BlockManager
13/11/22 11:19:29 INFO HttpBroadcast: Broadcast server started at http://10.90.66.67:52531
13/11/22 11:19:29 INFO SparkEnv: Registering MapOutputTracker
13/11/22 11:19:29 INFO HttpFileServer: HTTP File server directory is /tmp/spark-40525f81-f883-45d5-92ad-bbff44ecf435
13/11/22 11:19:29 INFO SparkUI: Started Spark Web UI at http://pmtx-master02:4040
13/11/22 11:19:29 INFO Client$ClientActor: Connecting to master spark://pmtx-master01:7077
13/11/22 11:19:30 ERROR Client$ClientActor: Connection to master failed; stopping client
13/11/22 11:19:30 ERROR SparkDeploySchedulerBackend: Disconnected from Spark cluster!
13/11/22 11:19:30 ERROR ClusterScheduler: Exiting due to error from cluster scheduler: Disconnected from Spark cluster
[snip]

WORKING: Building HADOOP 2.0.5-alpha support

On pmtx-master01, now building hadoop 2.0.5-alpha :
# sbt/sbt clean
...
# SPARK_HADOOP_VERSION=2.0.5-alpha sbt/sbt assembly
...
# ./bin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /cluster/bin/spark-0.8.0-incubating/bin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-pmtx-master01.out

Same build on pmtx-master02 :
# sbt/sbt clean
...build successful...
# SPARK_HADOOP_VERSION=2.0.5-alpha sbt/sbt assembly
...build successful...
# ./spark-class org.apache.spark.deploy.worker.Worker spark://pmtx-master01:7077
13/11/22 11:25:34 INFO Slf4jEventHandler: Slf4jEventHandler started
13/11/22 11:25:34 INFO Worker: Starting Spark worker pmtx-master02:33768 with 8 cores, 22.6 GB RAM
13/11/22 11:25:34 INFO Worker: Spark home: /cluster/bin/spark
13/11/22 11:25:34 INFO WorkerWebUI: Started Worker web UI at http://pmtx-master02:8081
13/11/22 11:25:34 INFO Worker: Connecting to master spark://pmtx-master01:7077
13/11/22 11:25:34 INFO Worker: Successfully registered with master
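The `nc -v pmtx-master01 7077` check above can be expressed as a small Python probe, handy for confirming the master port is reachable before suspecting the build itself (a minimal sketch; the `port_open` helper name is my own, not part of Spark):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds,
    roughly what `nc -v host port` checks."""
    try:
        # create_connection performs the full TCP handshake, then closes.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. port_open("pmtx-master01", 7077) from a worker box before
# launching org.apache.spark.deploy.worker.Worker
```

Note that, as in this thread, a reachable port only proves the TCP layer is fine; the worker can still fail to register if the two assemblies were built against incompatible dependency versions.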
Re: Worker failed to connect when build with SPARK_HADOOP_VERSION=2.2.0
Horia, if you don't need YARN support you can get it to work by setting SPARK_YARN to false:

SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=false sbt/sbt assembly

Raymond, ok, thank you, so that's why; I'm using the latest release, 0.8.0 (September 25, 2013).

2013/12/2 Liu, Raymond <raymond@intel.com>:
> What version of the code are you using? 2.2.0 support is not yet merged into trunk. Check out https://github.com/apache/incubator-spark/pull/199
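The working build combinations from this thread can be captured in a small shell helper (a sketch; the `spark_assembly_cmd` function is my own, while the environment variables are the ones Spark 0.8's sbt build reads, as shown above):

```shell
#!/bin/sh
# Print the sbt assembly invocation for a given Hadoop version.
# SPARK_YARN=false sidesteps the YARN 2.2.0 support that is not
# yet merged into trunk (see pull/199 above).
spark_assembly_cmd() {
    hadoop_version="$1"
    yarn="${2:-false}"    # default to no YARN support
    printf 'SPARK_HADOOP_VERSION=%s SPARK_YARN=%s sbt/sbt assembly\n' \
        "$hadoop_version" "$yarn"
}

spark_assembly_cmd 2.2.0 false        # the workaround from this thread
spark_assembly_cmd 2.0.5-alpha true   # a combination reported working
```

Whichever combination is chosen, it must be identical on every box: the thread's failure mode is exactly what a master and worker built from mismatched assemblies look like.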