Idan Zalzberg created SPARK-1394:
------------------------------------
Summary: calling platform.system() on a worker raises an exception
Key: SPARK-1394
URL: https://issues.apache.org/jira/browse/SPARK-1394
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 0.9.0
Environment: Tested on Ubuntu and Linux, local and remote master, python 2.7.*
Reporter: Idan Zalzberg
A simple program that calls platform.system() on a worker fails most of the time (it occasionally succeeds, but only rarely).
This is critical since many libraries call that method (e.g. boto).
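For reference, a minimal standalone script that reproduces the failure might look like the sketch below; the master URL and application name are only illustrative, and it assumes Spark 0.9.0 with pyspark on the PYTHONPATH:

import platform

from pyspark import SparkContext

# Illustrative local setup; any master that launches Python workers should do.
sc = SparkContext("local", "platform-system-repro")

# Calling platform.system() inside the mapped function fails on the worker
# most of the time with IOError: [Errno 10] No child processes.
print(sc.parallelize([1]).map(lambda x: platform.system()).collect())

sc.stop()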
Here is the trace from an attempt to call that method in the PySpark shell:
$ /usr/local/spark/bin/pyspark
Python 2.7.3 (default, Feb 27 2014, 20:00:17)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
14/04/02 18:18:37 INFO Utils: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/04/02 18:18:37 WARN Utils: Your hostname, qlika-dev resolves to a loopback address: 127.0.1.1; using 10.33.102.46 instead (on interface eth1)
14/04/02 18:18:37 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
14/04/02 18:18:38 INFO Slf4jLogger: Slf4jLogger started
14/04/02 18:18:38 INFO Remoting: Starting remoting
14/04/02 18:18:39 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:36640]
14/04/02 18:18:39 INFO Remoting: Remoting now listens on addresses: [akka.tcp://[email protected]:36640]
14/04/02 18:18:39 INFO SparkEnv: Registering BlockManagerMaster
14/04/02 18:18:39 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20140402181839-919f
14/04/02 18:18:39 INFO MemoryStore: MemoryStore started with capacity 294.6 MB.
14/04/02 18:18:39 INFO ConnectionManager: Bound socket to port 43357 with id = ConnectionManagerId(10.33.102.46,43357)
14/04/02 18:18:39 INFO BlockManagerMaster: Trying to register BlockManager
14/04/02 18:18:39 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager 10.33.102.46:43357 with 294.6 MB RAM
14/04/02 18:18:39 INFO BlockManagerMaster: Registered BlockManager
14/04/02 18:18:39 INFO HttpServer: Starting HTTP Server
14/04/02 18:18:39 INFO HttpBroadcast: Broadcast server started at http://10.33.102.46:51803
14/04/02 18:18:39 INFO SparkEnv: Registering MapOutputTracker
14/04/02 18:18:39 INFO HttpFileServer: HTTP File server directory is /tmp/spark-9b38acb0-7b01-4463-b0a6-602bfed05a2b
14/04/02 18:18:39 INFO HttpServer: Starting HTTP Server
14/04/02 18:18:40 INFO SparkUI: Started Spark Web UI at http://10.33.102.46:4040
14/04/02 18:18:40 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 0.9.0
/_/
Using Python version 2.7.3 (default, Feb 27 2014 20:00:17)
Spark context available as sc.
>>> import platform
>>> sc.parallelize([1]).map(lambda x : platform.system()).collect()
14/04/02 18:19:17 INFO SparkContext: Starting job: collect at <stdin>:1
14/04/02 18:19:17 INFO DAGScheduler: Got job 0 (collect at <stdin>:1) with 1 output partitions (allowLocal=false)
14/04/02 18:19:17 INFO DAGScheduler: Final stage: Stage 0 (collect at <stdin>:1)
14/04/02 18:19:17 INFO DAGScheduler: Parents of final stage: List()
14/04/02 18:19:17 INFO DAGScheduler: Missing parents: List()
14/04/02 18:19:17 INFO DAGScheduler: Submitting Stage 0 (PythonRDD[1] at collect at <stdin>:1), which has no missing parents
14/04/02 18:19:17 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (PythonRDD[1] at collect at <stdin>:1)
14/04/02 18:19:17 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
14/04/02 18:19:17 INFO TaskSetManager: Starting task 0.0:0 as TID 0 on executor localhost: localhost (PROCESS_LOCAL)
14/04/02 18:19:17 INFO TaskSetManager: Serialized task 0.0:0 as 2152 bytes in 12 ms
14/04/02 18:19:17 INFO Executor: Running task ID 0
PySpark worker failed with exception:
Traceback (most recent call last):
File "/usr/local/spark/python/pyspark/worker.py", line 77, in main
serializer.dump_stream(func(split_index, iterator), outfile)
File "/usr/local/spark/python/pyspark/serializers.py", line 182, in
dump_stream
self.serializer.dump_stream(self._batched(iterator), stream)
File "/usr/local/spark/python/pyspark/serializers.py", line 117, in
dump_stream
for obj in iterator:
File "/usr/local/spark/python/pyspark/serializers.py", line 171, in _batched
for item in iterator:
File "<stdin>", line 1, in <lambda>
File "/usr/lib/python2.7/platform.py", line 1306, in system
return uname()[0]
File "/usr/lib/python2.7/platform.py", line 1273, in uname
processor = _syscmd_uname('-p','')
File "/usr/lib/python2.7/platform.py", line 1030, in _syscmd_uname
rc = f.close()
IOError: [Errno 10] No child processes
14/04/02 18:19:17 ERROR Executor: Exception in task ID 0
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
File "/usr/local/spark/python/pyspark/worker.py", line 77, in main
serializer.dump_stream(func(split_index, iterator), outfile)
File "/usr/local/spark/python/pyspark/serializers.py", line 182, in
dump_stream
self.serializer.dump_stream(self._batched(iterator), stream)
File "/usr/local/spark/python/pyspark/serializers.py", line 117, in
dump_stream
for obj in iterator:
File "/usr/local/spark/python/pyspark/serializers.py", line 171, in _batched
for item in iterator:
File "<stdin>", line 1, in <lambda>
File "/usr/lib/python2.7/platform.py", line 1306, in system
return uname()[0]
File "/usr/lib/python2.7/platform.py", line 1273, in uname
processor = _syscmd_uname('-p','')
File "/usr/lib/python2.7/platform.py", line 1030, in _syscmd_uname
rc = f.close()
IOError: [Errno 10] No child processes
at org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:131)
at org.apache.spark.api.python.PythonRDD$$anon$1.<init>(PythonRDD.scala:153)
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:96)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:109)
at org.apache.spark.scheduler.Task.run(Task.scala:53)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
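A possible user-side workaround, offered only as a sketch: the traceback ends in a pipe close inside platform._syscmd_uname(), which uses os.popen(), so one assumption (not confirmed in this report) is that the worker process has SIGCHLD set to SIG_IGN, making the pipe close raise IOError: [Errno 10] No child processes. If that assumption holds, restoring the default SIGCHLD handler inside the task before calling platform.system() may avoid the error; the function and application names below are hypothetical.

import platform
import signal

from pyspark import SparkContext

def system_with_default_sigchld(x):
    # Assumption (not confirmed in this report): the worker ignores SIGCHLD,
    # so the pipe opened by platform.uname() via os.popen() fails on close.
    # Restoring the default handler first is a tentative workaround.
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    return platform.system()

sc = SparkContext("local", "platform-system-workaround")  # illustrative setup
print(sc.parallelize([1]).map(system_with_default_sigchld).collect())
sc.stop()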