I upgraded our Spark standalone cluster from 1.4.1 to 1.6.0 yesterday. We are 
now seeing regular timeouts between two of the workers when making connections. 
These workers and the same driver code worked fine running on 1.4.1 and 
finished in under a second. Any thoughts on what might have changed?

16/01/06 19:17:58 ERROR RetryingBlockFetcher: Exception while beginning fetch 
of 1 outstanding blocks (after 3 retries)
java.io.IOException: Connecting to /10.248.0.218:52104 timed out (120000 ms)
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:214)
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
at 
org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:90)
at 
org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
at 
org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
at 
org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
16/01/06 19:17:58 WARN BlockManager: Failed to fetch remote block rdd_74_3 from 
BlockManagerId(1, 10.248.0.218, 52104) (failed attempt 1)
java.io.IOException: Connecting to /10.248.0.218:52104 timed out (120000 ms)
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:214)
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
at 
org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:90)
at 
org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
at 
org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
at 
org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)



Thanks,
Jeff





This message (and any attachments) is intended only for the designated 
recipient(s). It
may contain confidential or proprietary information, or have other limitations 
on use as
indicated by the sender. If you are not a designated recipient, you may not 
review, use,
copy or distribute this message. If you received this in error, please notify 
the sender by
reply e-mail and delete this message.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org

Reply via email to