Hi Pedro,

Apologies for not adding this earlier.
This is running on a local cluster set up as follows:

    JavaSparkContext jsc = new JavaSparkContext("local[2]", "DR");

Any suggestions based on this? The ports are not blocked by a firewall.

Regards,

On Sat, Jul 23, 2016 at 8:35 PM, Pedro Rodriguez <ski.rodrig...@gmail.com> wrote:

> Make sure that you don't have ports firewalled. You don't really give much
> information to work from, but it looks like the master can't access the
> worker nodes for some reason. If you give more information on the cluster,
> networking, etc., it would help.
>
> For example, on AWS you can create a security group which allows all
> traffic to/from itself. If you are using something like ufw on
> Ubuntu then you probably need to know the IP addresses of the worker nodes
> beforehand.
>
> --
> Pedro Rodriguez
> PhD Student in Large-Scale Machine Learning | CU Boulder
> Systems Oriented Data Scientist
> UC Berkeley AMPLab Alumni
>
> pedrorodriguez.io | 909-353-4423
> github.com/EntilZha | LinkedIn
> <https://www.linkedin.com/in/pedrorodriguezscience>
>
> On July 23, 2016 at 7:38:01 AM, VG (vlin...@gmail.com) wrote:
>
> Please suggest if I am doing something wrong, or an alternative way of
> doing this.
>
> I have an RDD with two values as follows:
>
>     JavaPairRDD<String, Long> rdd
>
> When I execute rdd.collectAsMap()
> it always fails with IO exceptions.
> 16/07/23 19:03:58 ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks
> java.io.IOException: Failed to connect to /192.168.1.3:58179
>         at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
>         at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
>         at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:96)
>         at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
>         at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120)
>         at org.apache.spark.network.netty.NettyBlockTransferService.fetchBlocks(NettyBlockTransferService.scala:105)
>         at org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:92)
>         at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:546)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:76)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
>         at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1793)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:56)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>         at java.lang.Thread.run(Unknown Source)
> Caused by: java.net.ConnectException: Connection timed out: no further information: /192.168.1.3:58179
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
>         at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
>         at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         ... 1 more
> 16/07/23 19:03:58 INFO RetryingBlockFetcher: Retrying fetch (1/3) for 1 outstanding blocks after 5000 ms
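P.S. Since the trace shows the driver trying to fetch task results from the LAN address 192.168.1.3, one thing worth trying is pinning the driver to the loopback interface so that block transfers in local mode never leave the machine. Below is a minimal, self-contained sketch of the setup described above; the spark.driver.host setting is only a guessed workaround (it is a real Spark property, but whether it fixes this particular timeout is an assumption), and the RDD contents are made up for illustration since the original data is not shown:

```java
import java.util.Arrays;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class DR {
    public static void main(String[] args) {
        // Same local[2] master and app name as in the thread, but built via
        // SparkConf so the driver address can be set explicitly. Binding to
        // 127.0.0.1 is a guess at avoiding the unreachable 192.168.1.3 route.
        SparkConf conf = new SparkConf()
                .setMaster("local[2]")
                .setAppName("DR")
                .set("spark.driver.host", "127.0.0.1");
        JavaSparkContext jsc = new JavaSparkContext(conf);

        // Illustrative data only; the original RDD's contents are not shown.
        JavaPairRDD<String, Long> rdd = jsc.parallelizePairs(Arrays.asList(
                new Tuple2<>("a", 1L),
                new Tuple2<>("b", 2L)));

        // The call that fails in the thread; in a healthy local setup this
        // should return without any RetryingBlockFetcher errors.
        Map<String, Long> result = rdd.collectAsMap();
        System.out.println(result);

        jsc.stop();
    }
}
```

If this still times out with the driver bound to loopback, the problem is likely something on the machine itself (a local firewall or VPN intercepting loopback-adjacent traffic) rather than Spark configuration.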