[
https://issues.apache.org/jira/browse/SPARK-20579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994916#comment-15994916
]
yao zhang commented on SPARK-20579:
-----------------------------------
I have recently been using Scala Spark to do complex data processing on large
datasets, but I always run into hung jobs with many active jobs/stages in the
web UI.
Here are the details:
1. It only happens with spark-submit; when I test the code in spark-shell, I
never hit this issue.
2. I always use dynamic allocation.
3. My cluster is large enough for my task (I can use 2000 executors with 8G
memory each, and a driver with 100G memory).
4. The basic failure pattern (from the UI) is: some stages never complete and
stay active => the corresponding jobs stay incomplete and active => active
tasks accumulate on executors => RDD blocks accumulate on executors =>
executors get locked up => the application hangs and cannot make progress.
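For context, a submission matching the setup described in points 2 and 3 would look roughly like the sketch below. This is a hypothetical reconstruction, not the reporter's actual command: the master/deploy-mode flags and the jar name are assumptions; the spark.dynamicAllocation.* and external shuffle service properties are the standard Spark 2.1 settings for dynamic allocation.

```shell
# Hypothetical spark-submit invocation matching the described setup:
# dynamic allocation, up to 2000 executors with 8G each, 100G driver.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 100g \
  --executor-memory 8g \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.maxExecutors=2000 \
  --conf spark.shuffle.service.enabled=true \
  my-processing-job.jar  # placeholder jar name
```

Note that in Spark 2.1, dynamic allocation requires the external shuffle service (spark.shuffle.service.enabled=true) so that shuffle data survives executor removal.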
> large spark job hang on with many active stages/jobs
> ----------------------------------------------------
>
> Key: SPARK-20579
> URL: https://issues.apache.org/jira/browse/SPARK-20579
> Project: Spark
> Issue Type: Bug
> Components: Spark Submit
> Affects Versions: 2.1.0
> Environment: Spark 2.1.0 in a Hadoop (2.6.0) cluster
> Reporter: yao zhang
> Attachments: executor-screen.png, job-screen.png, stage-screen.png,
> storage-screen.png, thread-dump-screen.png
>
>
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]