Has anyone encountered this “port out of range” error when launching PySpark 
jobs on YARN? It is sporadic (roughly 2 out of 3 jobs hit it).

LOG:

15/06/19 11:49:44 INFO scheduler.TaskSetManager: Lost task 0.3 in stage 39.0 
(TID 211) on executor xxx.xxx.xxx.com: 
java.lang.IllegalArgumentException (port out of range:1315905645) [duplicate 7]
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
15/06/19 11:49:44 INFO cluster.YarnScheduler: Removed TaskSet 39.0, whose tasks 
have all completed, from pool
 File "/home/john/spark-1.4.0/python/pyspark/rdd.py", line 745, in collect
   port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
 File 
"/home/john/spark-1.4.0/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", 
line 538, in __call__
 File 
"/home/john/spark-1.4.0/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 
300, in get_return_value
py4j.protocol.Py4JJavaError15/06/19 11:49:44 INFO storage.BlockManagerInfo: 
Removed broadcast_38_piece0 on 17.134.160.35:47455 in memory (size: 2.2 KB, 
free: 265.4 MB)
: An error occurred while calling 
z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in 
stage 39.0 failed 4 times, most recent failure: Lost task 1.3 in stage 39.0 
(TID 210, xxx.xxx.xxx.com): 
java.lang.IllegalArgumentException: port out of range:1315905645
        at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
        at java.net.InetSocketAddress.<init>(InetSocketAddress.java:185)
        at java.net.Socket.<init>(Socket.java:241)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:75)
        at 
org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:90)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:130)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:73)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
        at org.apache.spark.scheduler.Task.run(Task.scala:70)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
        at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)
        at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1257)
        at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1256)
        at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at 
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1256)
        at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
        at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
        at scala.Option.foreach(Option.scala:236)
        at 
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
        at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1450)
        at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
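
One thing I noticed (plain Python, just to sanity-check the number from the log): 
the bogus port value decodes to ASCII text when read as four big-endian bytes.

import struct
# 1315905645 == 0x4E6F206D == the ASCII bytes 'No m'
print(struct.pack('>i', 1315905645))

So my guess is the executor JVM is reading the start of some Python error message 
(e.g. "No module named ...") where it expects the worker daemon's port number, 
but I'm not sure.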

SPARK 1.4.0 BUILD:

build/mvn -Pyarn -Phive -Phadoop-2.3 -Dhadoop.version=2.3.0-cdh5.1.4 
-DskipTests clean package

LAUNCH CMD:

export HADOOP_CONF_DIR=/path/to/conf
export PYSPARK_PYTHON=/path/to/python-2.7.2/bin/python
~/spark-1.4.0/bin/pyspark \
--conf spark.yarn.jar=/home/john/spark-1.4.0/assembly/target/scala-2.10/spark-assembly-1.4.0-hadoop2.3.0-cdh5.1.4.jar \
--master yarn-client \
--num-executors 3 \
--executor-cores 18 \
--executor-memory 48g
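
A quick sanity check (just a sketch, run in the same pyspark shell) to confirm the 
executors actually pick up the Python set via PYSPARK_PYTHON:

def python_exec(_):
    # imported inside the function so it resolves on the executor, not the driver
    import sys
    return sys.executable

# a few tasks spread across the executors; returns the Python each worker runs
sc.parallelize(range(3), 3).map(python_exec).collect()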

TEST JOB IN REPL:

words = ['hi', 'there', 'yo', 'baby']
wordsRdd = sc.parallelize(words)
wordsRdd.map(lambda x: (x, 1)).collect()  # collect() is what hits the error above
