[ 
https://issues.apache.org/jira/browse/FLINK-28982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LI Mingkun updated FLINK-28982:
-------------------------------
    Description: 
Task starts the TaskInterrupter only when `ExecutionState` is INITIALIZING or 
RUNNING. The check is in the function: 
org.apache.flink.runtime.taskmanager.Task#cancelOrFailAndCancelInvokableInternal

 

While using Flink Remote Shuffle, I ran into a deadlock across multiple tasks. 
It was caused by a bug in Flink Remote Shuffle's shared TCP connection 
handling, and it blocked task destruction.

The stack trace is as follows:

!image-2022-08-16-12-10-43-894.png!

My question: why not start the TaskInterrupter when canceling a task that is 
still DEPLOYING?
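
To make the question concrete, here is a minimal sketch of the state check described above. It is not Flink's actual code: the `State` enum is a simplified stand-in for `ExecutionState`, and `CURRENT`, `PROPOSED`, and `startsInterrupter` are hypothetical names used only for illustration. It assumes the proposal amounts to adding DEPLOYING to the set of states that start the interrupter on cancel.

```java
import java.util.EnumSet;
import java.util.Set;

public class InterrupterSketch {
    // Simplified stand-in for Flink's ExecutionState (assumption: only the
    // states relevant to this issue are modeled).
    enum State { CREATED, DEPLOYING, INITIALIZING, RUNNING, CANCELING, CANCELED }

    // Behavior as described in this issue: the interrupter is started only
    // when the task is INITIALIZING or RUNNING at the time of cancellation.
    static final Set<State> CURRENT =
            EnumSet.of(State.INITIALIZING, State.RUNNING);

    // Hypothetical proposed behavior: also start it for a DEPLOYING task.
    static final Set<State> PROPOSED =
            EnumSet.of(State.DEPLOYING, State.INITIALIZING, State.RUNNING);

    // Returns true if cancelling a task in the given state would start the
    // interrupter under the given policy.
    static boolean startsInterrupter(Set<State> policy, State stateAtCancel) {
        return policy.contains(stateAtCancel);
    }

    public static void main(String[] args) {
        // Print both policies side by side for every modeled state.
        for (State s : State.values()) {
            System.out.printf("%-12s current=%b proposed=%b%n",
                    s, startsInterrupter(CURRENT, s), startsInterrupter(PROPOSED, s));
        }
    }
}
```

Under this sketch the two policies differ only for DEPLOYING, which is the case where the blocked task was never interrupted.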

 

  was:
Task starts the TaskInterrupter only when `ExecutionState` is INITIALIZING or 
RUNNING. The check is in the function: 
org.apache.flink.runtime.taskmanager.Task#cancelOrFailAndCancelInvokableInternal

 

While using Flink Remote Shuffle, I encountered a deadlock across multiple 
tasks. It was caused by a bug in Flink Remote Shuffle's shared TCP connection 
handling, and it blocked task destruction.

The stack trace is as follows:

!image-2022-08-16-12-10-43-894.png!

My question: why not start the TaskInterrupter when canceling a task that is 
still DEPLOYING?

 


> Start TaskInterrupter when task switch from DEPLOYING to CANCELING
> ------------------------------------------------------------------
>
>                 Key: FLINK-28982
>                 URL: https://issues.apache.org/jira/browse/FLINK-28982
>             Project: Flink
>          Issue Type: Technical Debt
>          Components: Runtime / Web Frontend
>            Reporter: LI Mingkun
>            Priority: Major
>         Attachments: image-2022-08-16-12-10-43-894.png
>
>
> Task starts the TaskInterrupter only when `ExecutionState` is INITIALIZING or 
> RUNNING. The check is in the function: 
> org.apache.flink.runtime.taskmanager.Task#cancelOrFailAndCancelInvokableInternal
>  
> While using Flink Remote Shuffle, I ran into a deadlock across multiple 
> tasks. It was caused by a bug in Flink Remote Shuffle's shared TCP connection 
> handling, and it blocked task destruction.
> The stack trace is as follows:
> !image-2022-08-16-12-10-43-894.png!
> My question: why not start the TaskInterrupter when canceling a task that is 
> still DEPLOYING?
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
