Hi,

I've had a working cluster with 20 workers for more than a couple of weeks. Everything was perfect. Today I added 4 more workers, and none of them could fetch the jar files from the master.
To me, the following log means that the master is reachable from the worker, that it is registered there, and that it started everything. But when the executor was assigned a task, it couldn't download the jar file from the master.

16/12/01 06:25:36 INFO CoarseGrainedExecutorBackend: Connecting to driver: spark://coarsegrainedschedu...@xx.xx.xx.xxx:47783
16/12/01 06:25:36 INFO WorkerWatcher: Connecting to worker spark://wor...@xx.xx.xx.xx:46799
16/12/01 06:25:36 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
16/12/01 06:25:36 INFO Executor: Starting executor ID 304 on host machine059.company.com
16/12/01 06:25:36 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 48844.
16/12/01 06:25:36 INFO NettyBlockTransferService: Server created on 48844
16/12/01 06:25:36 INFO BlockManagerMaster: Trying to register BlockManager
16/12/01 06:25:36 INFO BlockManagerMaster: Registered BlockManager
16/12/01 06:25:44 INFO CoarseGrainedExecutorBackend: Got assigned task 9239511
16/12/01 06:25:44 INFO Executor: Running task 46.0 in stage 159585.0 (TID 9239511)
16/12/01 06:25:44 INFO Executor: Fetching http://xx.xx.xx.xxx:56027/jars/sparkws-core-1.0.0-20161118.151127-286.jar with timestamp 1479534760567
16/12/01 06:25:44 ERROR Executor: Exception in task 46.0 in stage 159585.0 (TID 9239511)
java.io.FileNotFoundException: http://xx.xx.xx.xxx:56027/jars/sparkws-core-1.0.0-20161118.151127-286.jar

When I issue the same download manually, I simply get a file-not-found error:
$ curl "http://xx.xx.xx.xxx:56027/jars/sparkws-spark-1.0.0-20161118.151127-286.jar"
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 404 Not Found</title>
</head>
<body>
<h2>HTTP ERROR: 404</h2>
<p>Problem accessing /jars/sparkws-spark-1.0.0-20161118.151127-286.jar. Reason:
<pre>    Not Found</pre></p>
<hr /><i><small>Powered by Jetty://</small></i>

The Spark environment says the jars are there:

spark.jars    /opt/sparkws/lib/sparkws-spark-1.0.0-20161118.151127-286.jar,/opt/sparkws/lib/sparkws-core-1.0.0-20161118.151127-286.jar

And they are indeed there:

[user@machine004]$ ll /opt/sparkws/lib/ | grep sparkws
156455 Nov 18 15:12 sparkws-core-1.0.0-20161118.151127-286.jar
18944178 Nov 18 15:14 sparkws-spark-1.0.0-20161118.151127-286.jar

Does anyone know what the issue might be here?
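In case it helps anyone reproduce this, the manual curl check can be repeated for every jar listed in spark.jars rather than just one. Below is a minimal sketch in Python. It assumes the file server exposes each jar under /jars/<file name>, as in the fetch URL from the executor log. The function names jar_urls and probe, and the script name in the usage note, are only illustrative and not part of Spark.

```python
import sys
import urllib.error
import urllib.request


def jar_urls(base_url, spark_jars):
    """Turn a comma-separated spark.jars value into /jars/<file name> URLs.

    Assumption: the server publishes each jar under /jars/ by file name,
    which is what the fetch URL in the executor log suggests.
    """
    names = [p.strip().rsplit("/", 1)[-1] for p in spark_jars.split(",") if p.strip()]
    return [f"{base_url.rstrip('/')}/jars/{name}" for name in names]


def probe(url, timeout=5.0):
    """Return the HTTP status code for a GET, or the error text if unreachable."""
    try:
        # The body is not read; only the status line and headers matter here.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # A 404 like the one in the post ends up here.
        return err.code
    except (urllib.error.URLError, OSError) as err:
        # Connection refused, DNS failure, timeout, and similar.
        return str(err)


if __name__ == "__main__" and len(sys.argv) == 3:
    base, jars = sys.argv[1], sys.argv[2]
    for url in jar_urls(base, jars):
        print(probe(url), url)
```

Saved under a hypothetical name such as check_jars.py, it could be run from one of the new workers as `python check_jars.py http://xx.xx.xx.xxx:56027 "<value of spark.jars>"`. It prints one status per jar, which would show whether both jars return 404 or only one of them.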