[ https://issues.apache.org/jira/browse/SPARK-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Davies Liu resolved SPARK-1284.
-------------------------------
Resolution: Fixed
Fix Version/s: 1.1.0
I think this is a logging issue that should be fixed by
https://github.com/apache/spark/pull/1625, so I am closing it.
If anyone encounters this again, we can reopen it.
> pyspark hangs after IOError on Executor
> ---------------------------------------
>
> Key: SPARK-1284
> URL: https://issues.apache.org/jira/browse/SPARK-1284
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Reporter: Jim Blomo
> Assignee: Davies Liu
> Fix For: 1.1.0
>
>
> When running a reduceByKey over a cached RDD, Python fails with an exception,
> but the failure is not detected by the task runner. Spark and the pyspark
> shell hang waiting for the task to finish.
> The error is:
> {code}
> PySpark worker failed with exception:
> Traceback (most recent call last):
>   File "/home/hadoop/spark/python/pyspark/worker.py", line 77, in main
>     serializer.dump_stream(func(split_index, iterator), outfile)
>   File "/home/hadoop/spark/python/pyspark/serializers.py", line 182, in dump_stream
>     self.serializer.dump_stream(self._batched(iterator), stream)
>   File "/home/hadoop/spark/python/pyspark/serializers.py", line 118, in dump_stream
>     self._write_with_length(obj, stream)
>   File "/home/hadoop/spark/python/pyspark/serializers.py", line 130, in _write_with_length
>     stream.write(serialized)
> IOError: [Errno 104] Connection reset by peer
> 14/03/19 22:48:15 INFO scheduler.TaskSetManager: Serialized task 4.0:0 as 4257 bytes in 47 ms
> Traceback (most recent call last):
>   File "/home/hadoop/spark/python/pyspark/daemon.py", line 117, in launch_worker
>     worker(listen_sock)
>   File "/home/hadoop/spark/python/pyspark/daemon.py", line 107, in worker
>     outfile.flush()
> IOError: [Errno 32] Broken pipe
> {code}
> I can reproduce the error by running take(10) on the cached RDD before
> running reduceByKey (which looks at the whole input file).
> Affects Version 1.0.0-SNAPSHOT (4d88030486)
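
For readers unfamiliar with the two IOErrors in the traceback, here is a minimal standard-library sketch of the general failure mode: a writer keeps streaming records after its consumer has closed the connection. This is not Spark's code. The names stream_records and demo are hypothetical, and the idea that take(10) makes the consumer stop reading early is an assumption on my part, not something the report confirms. The sketch only shows that the write error can be caught and returned to the caller so that the failure is visible instead of silently lost.

```python
import socket


def stream_records(sock, records):
    """Write length-prefixed records to sock.

    Returns (number_of_records_sent, error_or_None). Hypothetical helper,
    not part of PySpark.
    """
    sent = 0
    try:
        for rec in records:
            payload = rec.encode("utf-8")
            # 4-byte big-endian length prefix, then the payload itself.
            sock.sendall(len(payload).to_bytes(4, "big") + payload)
            sent += 1
    except OSError as exc:
        # In Python 3, IOError is an alias of OSError. Broken pipe and
        # connection reset both arrive here as OSError subclasses.
        return sent, exc
    return sent, None


def demo():
    """Simulate a consumer that goes away before the writer is done."""
    reader, writer = socket.socketpair()
    reader.close()  # assumption: models a consumer that stops reading early
    try:
        return stream_records(writer, ("row-%d" % i for i in range(1000)))
    finally:
        writer.close()


if __name__ == "__main__":
    sent, error = demo()
    print("records sent:", sent)
    print("error type:", type(error).__name__)
```

Returning the exception to the caller, rather than only logging it, gives whatever supervises the writer a chance to mark the work as failed instead of waiting on it.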