[jira] [Comment Edited] (SPARK-17954) FetchFailedException executor cannot connect to another worker executor
[ https://issues.apache.org/jira/browse/SPARK-17954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15581147#comment-15581147 ] Vitaly Gerasimov edited comment on SPARK-17954 at 10/17/16 8:03 AM:

I figured out this issue. The problem is that the Spark executor port listens only on localhost:

{code}
~# netstat -ntlp
tcp6       0      0 127.0.0.1:46721         :::*                    LISTEN      11294/java
{code}

Was there a configuration change that makes the executor listen on localhost only? When I run Spark 1.6.2, the executor listens on any host. I know it may depend on my /etc/hosts file, but this change seems unexpected to me.

The 8081 UI port works fine:

{code}
~# netstat -ntlp
tcp6       0      0 :::8081                 :::*                    LISTEN      11294/java
{code}
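For context (plain Python sockets, not Spark code): the difference between the two netstat lines above is only the bind address. A listener bound to 127.0.0.1 accepts connections from the local host alone, while a wildcard bind is reachable from other nodes as well, which is exactly why the shuffle fetch from another worker fails. A minimal illustration:

{code}
import socket

# Two listeners mirroring the two netstat lines above:
# one bound to loopback only, one to all interfaces.
loopback = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
loopback.bind(("127.0.0.1", 0))   # like 127.0.0.1:46721 -- only this host can connect
loopback.listen(1)

wildcard = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
wildcard.bind(("0.0.0.0", 0))     # like :::8081 -- other hosts can connect too
wildcard.listen(1)

print("loopback bound to", loopback.getsockname()[0])   # 127.0.0.1
print("wildcard bound to", wildcard.getsockname()[0])   # 0.0.0.0
{code}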
> FetchFailedException executor cannot connect to another worker executor
> ---
>
> Key: SPARK-17954
> URL: https://issues.apache.org/jira/browse/SPARK-17954
> Project: Spark
> Issue Type: Bug
> Affects Versions: 2.0.0, 2.0.1
> Reporter: Vitaly Gerasimov
>
> I have a standalone-mode Spark cluster which has three nodes:
> master.test
> worker1.test
> worker2.test
> I am trying to run the following code in spark-shell:
> {code}
> val json = spark.read.json("hdfs://master.test/json/a.js.gz", "hdfs://master.test/json/b.js.gz")
> json.createOrReplaceTempView("messages")
> spark.sql("select count(*) from messages").show()
> {code}
> and I am getting the following exception:
> {code}
> org.apache.spark.shuffle.FetchFailedException: Failed to connect to worker1.test/x.x.x.x:51029
> 	at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:357)
> 	at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:332)
> 	at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:54)
> 	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> 	at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
> 	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
> 	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithoutKey$(Unknown Source)
> 	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
> 	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> 	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:246)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:240)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:784)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:784)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
> 	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:85)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Failed to connect to worker1.test/x.x.x.x:51029
> 	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
> 	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
> 	at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:96)
> 	at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
> 	at o
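The /etc/hosts remark in the comment above is the likely lead: by default Spark derives the executor's bind address from the machine's hostname, so a hosts entry mapping that hostname to 127.0.0.1 can produce exactly the loopback-only listener shown. A possible workaround sketch, not confirmed as the fix for this issue (the address is a placeholder):

{code}
# conf/spark-env.sh on each worker node.
# SPARK_LOCAL_IP forces the address Spark binds to; replace x.x.x.x
# with the node's routable IP instead of letting Spark resolve the
# hostname (which /etc/hosts may map to 127.0.0.1).
SPARK_LOCAL_IP=x.x.x.x
{code}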