[ https://issues.apache.org/jira/browse/SPARK-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093137#comment-14093137 ]

Davies Liu commented on SPARK-1284:
-----------------------------------

[~jblomo], could you reproduce this on master or 1.1 branch?

Maybe pyspark did not hang after this error message; the take() had already
finished successfully before the error message popped up. The noisy error
messages have been fixed in PR https://github.com/apache/spark/pull/1625

> pyspark hangs after IOError on Executor
> ---------------------------------------
>
>                 Key: SPARK-1284
>                 URL: https://issues.apache.org/jira/browse/SPARK-1284
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>            Reporter: Jim Blomo
>            Assignee: Davies Liu
>
> When running a reduceByKey over a cached RDD, Python fails with an exception, 
> but the failure is not detected by the task runner.  Spark and the pyspark 
> shell hang waiting for the task to finish.
> The error is:
> {code}
> PySpark worker failed with exception:
> Traceback (most recent call last):
>   File "/home/hadoop/spark/python/pyspark/worker.py", line 77, in main
>     serializer.dump_stream(func(split_index, iterator), outfile)
>   File "/home/hadoop/spark/python/pyspark/serializers.py", line 182, in 
> dump_stream
>     self.serializer.dump_stream(self._batched(iterator), stream)
>   File "/home/hadoop/spark/python/pyspark/serializers.py", line 118, in 
> dump_stream
>     self._write_with_length(obj, stream)
>   File "/home/hadoop/spark/python/pyspark/serializers.py", line 130, in 
> _write_with_length
>     stream.write(serialized)
> IOError: [Errno 104] Connection reset by peer
> 14/03/19 22:48:15 INFO scheduler.TaskSetManager: Serialized task 4.0:0 as 
> 4257 bytes in 47 ms
> Traceback (most recent call last):
>   File "/home/hadoop/spark/python/pyspark/daemon.py", line 117, in 
> launch_worker
>     worker(listen_sock)
>   File "/home/hadoop/spark/python/pyspark/daemon.py", line 107, in worker
>     outfile.flush()
> IOError: [Errno 32] Broken pipe
> {code}
> I can reproduce the error by running take(10) on the cached RDD before 
> running reduceByKey (which looks at the whole input file).
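> A minimal sketch of this reproduction sequence is below; the input path,
> the key function, and the local master setting are placeholders, since the
> original job is not shown in this report:
> {code}
> from pyspark import SparkContext
>
> sc = SparkContext("local", "SPARK-1284-repro")
>
> # Cache a pair RDD; "/tmp/input.txt" is a placeholder for the real input.
> pairs = sc.textFile("/tmp/input.txt") \
>     .filter(lambda line: line.strip()) \
>     .map(lambda line: (line.split()[0], 1)) \
>     .cache()
>
> # Step 1: take(10) materializes only a prefix of the cached RDD.
> print(pairs.take(10))
>
> # Step 2: reduceByKey scans the whole input; per this report, the worker
> # then dies with IOError and the shell hangs instead of failing the task.
> print(pairs.reduceByKey(lambda a, b: a + b).collect())
> {code}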
> Affects Version 1.0.0-SNAPSHOT (4d88030486)


