[ 
https://issues.apache.org/jira/browse/SPARK-39283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Rosen updated SPARK-39283:
-------------------------------
    Fix Version/s: 3.0.4
                   3.1.4
                   3.2.2
                   3.3.1

> Spark tasks stuck forever due to deadlock between TaskMemoryManager and 
> UnsafeExternalSorter
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-39283
>                 URL: https://issues.apache.org/jira/browse/SPARK-39283
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.0.0, 3.1.2
>            Reporter: Sandeep Pal
>            Assignee: Sandeep Pal
>            Priority: Critical
>              Labels: Deadlock, spark3.0
>             Fix For: 3.0.4, 3.1.4, 3.2.2, 3.3.1
>
>         Attachments: DeadlockSparkTasks.png
>
>
> We are seeing this deadlock between {{TaskMemoryManager}} and 
> {{UnsafeExternalSorter}} quite often on our workload. Sometimes the retry 
> succeeds, but sometimes we have to resort to hacks to break the deadlock, 
> such as explicitly shutting down the worker machines. 
> Below is the thread dump from the Spark UI showing the deadlock:
> !DeadlockSparkTasks.png!
>  
> I believe there was a related Jira for a similar deadlock between the same 
> threads, and it was resolved: 
> https://issues.apache.org/jira/browse/SPARK-27338
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

