[ 
https://issues.apache.org/jira/browse/SPARK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147513#comment-14147513
 ] 

Ziv Huang edited comment on SPARK-3687 at 9/25/14 8:36 AM:
-----------------------------------------------------------

A few minutes ago I ran a job twice, each time processing 203 sequence files.
Both runs hung, but with different behavior than before:
1. The Spark master web UI shows the job as finished with state
"failed" after a little over 3 minutes.
2. The job's stage web UI still appears hung, and the execution duration
keeps accumulating.
I hope this information helps with debugging :)


was (Author: taqilabon):
Just a few mins ago I ran a job twice, processing 203 sequence files.
Both times I saw the job hanging with different behavior from before: 
1. the web UI of spark master shows that the job is finished with state 
"failed" after 3.x mins
2. the job stage web UI still hangs, and execution duration time is still 
accumulating.
Hope this information helps debugging :)

> Spark hang while processing more than 100 sequence files
> --------------------------------------------------------
>
>                 Key: SPARK-3687
>                 URL: https://issues.apache.org/jira/browse/SPARK-3687
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.2, 1.1.0
>            Reporter: Ziv Huang
>
> In my application, I read more than 100 sequence files into a JavaPairRDD,
> perform a flatMap to get another JavaRDD, and then use takeOrdered to get the
> result.
> Quite often (but not always), Spark hangs while executing some of the
> 110th-130th tasks.
> The job can hang for several hours, maybe forever (I can't wait for its
> completion).
> When the Spark job hangs, I can't find any error message anywhere, and I
> can't kill the job from the web UI.
> The current workaround is to use coalesce to reduce the number of partitions
> to be processed.
> I have never seen a job hang when the number of partitions to be processed is
> no greater than 80.
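
For readers trying to reproduce the report, the pipeline and the coalesce workaround described above might look roughly like the sketch below. It is not the reporter's code: the input path, the Text key/value types, the flatMap body, and the class name are all hypothetical, and it assumes the Spark 1.x Java API (where a FlatMapFunction returns an Iterable) with Java 8 lambdas.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name; a sketch of the reported pipeline, not the original application.
public class CoalesceWorkaround {
    public static void main(String[] args) {
        // The master URL is expected to come from spark-submit.
        SparkConf conf = new SparkConf().setAppName("coalesce-workaround");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Assumption: a glob matching 100+ sequence files with Text keys and values.
        JavaPairRDD<Text, Text> records =
            sc.sequenceFile("hdfs:///path/to/input/*.seq", Text.class, Text.class);

        // Workaround from the report: cap the partition count at 80,
        // the largest value for which no hang was observed.
        JavaPairRDD<Text, Text> fewerPartitions = records.coalesce(80);

        // Assumption: the flatMap splits each value into whitespace-separated tokens.
        // Values are converted to String immediately because Hadoop reuses Writable objects.
        // Spark 1.x expects an Iterable here; Spark 2.x and later expect an Iterator.
        JavaRDD<String> tokens =
            fewerPartitions.flatMap(pair -> Arrays.asList(pair._2().toString().split("\\s+")));

        // takeOrdered returns the smallest elements by natural ordering.
        List<String> smallest = tokens.takeOrdered(10);
        for (String token : smallest) {
            System.out.println(token);
        }

        sc.stop();
    }
}
```

Removing the coalesce call gives the variant that, per the report, sometimes hangs once the partition count exceeds roughly 100.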



