[ 
https://issues.apache.org/jira/browse/FLINK-16416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yangze Guo updated FLINK-16416:
-------------------------------
    Description: 
Recently, I tried to add a new {{GPUManager}} to the {{TaskExecutorServices}}. I 
registered the "GPUManager#close" function, which contains some cleanup logic, 
with {{TaskExecutorServices#shutDown}}. However, I found that the cleanup 
logic does not run as expected in standalone mode.
 After investigating the codebase, I found that 
{{TaskExecutorServices#shutDown}} is called only on a fatal error, whereas 
{{flink-daemon.sh}} simply kills the TM process. However, the log shows 
that some services, e.g. TaskExecutorLocalStateStoresManager, did clean up 
after themselves by registering a {{shutdownHook}}.
 If that is the right way, then we need to register a {{shutdownHook}} for 
{{TaskExecutorServices}} as well.
 If it is not, we need to find another way to shut down the TM gracefully.

  was:
Recently, I tried to add a new {{GPUManager}} to the {{TaskExecutorServices}}. I 
registered the "close" function, which contains some cleanup logic, with 
{{TaskExecutorServices#shutDown}}. However, I found that the cleanup logic does 
not run as expected in standalone mode.
After investigating the codebase, I found that 
{{TaskExecutorServices#shutDown}} is called only on a fatal error, whereas 
{{flink-daemon.sh}} simply kills the TM process. However, the log shows 
that some services did clean up after themselves by registering a {{shutdownHook}}.
If that is the right way, then we need to register a {{shutdownHook}} for 
{{TaskExecutorServices}} as well.
If it is not, we need to find another way to shut down the TM gracefully.


> Shutdown the task manager gracefully in standalone mode
> -------------------------------------------------------
>
>                 Key: FLINK-16416
>                 URL: https://issues.apache.org/jira/browse/FLINK-16416
>             Project: Flink
>          Issue Type: Improvement
>          Components: Command Line Client
>            Reporter: Yangze Guo
>            Priority: Major
>
> Recently, I tried to add a new {{GPUManager}} to the {{TaskExecutorServices}}. 
> I registered the "GPUManager#close" function, which contains some cleanup 
> logic, with {{TaskExecutorServices#shutDown}}. However, I found that the 
> cleanup logic does not run as expected in standalone mode.
>  After investigating the codebase, I found that 
> {{TaskExecutorServices#shutDown}} is called only on a fatal error, whereas 
> {{flink-daemon.sh}} simply kills the TM process. However, the log 
> shows that some services, e.g. TaskExecutorLocalStateStoresManager, did clean 
> up after themselves by registering a {{shutdownHook}}.
>  If that is the right way, then we need to register a {{shutdownHook}} for 
> {{TaskExecutorServices}} as well.
>  If it is not, we need to find another way to shut down the TM gracefully.
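
For illustration, the shutdown-hook option mentioned in the description could look 
roughly like the sketch below. It is not Flink code: the {{Services}} and 
{{GpuManager}} classes and the {{installShutdownHook}} helper are hypothetical 
stand-ins for {{TaskExecutorServices}} and {{GPUManager}}. Only 
{{Runtime#addShutdownHook}} is a real JDK API. Note that JVM shutdown hooks run 
on a normal exit or on SIGTERM, but not on SIGKILL ({{kill -9}}).

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch: run a service container's cleanup from a JVM shutdown hook. */
public class ShutdownHookSketch {

    /** Hypothetical resource with cleanup logic (stand-in for GPUManager). */
    static final class GpuManager implements AutoCloseable {
        private boolean closed;

        @Override
        public synchronized void close() {
            if (closed) {
                return; // idempotent: the hook and a fatal-error path may both call close()
            }
            closed = true;
            System.out.println("GpuManager cleaned up");
        }

        synchronized boolean isClosed() {
            return closed;
        }
    }

    /** Hypothetical service container (stand-in for TaskExecutorServices). */
    static final class Services {
        private final List<AutoCloseable> closeables = new ArrayList<>();
        private boolean shutDown;

        synchronized void register(AutoCloseable closeable) {
            closeables.add(closeable);
        }

        /** Closes every registered service once; later calls are no-ops. */
        synchronized void shutDown() {
            if (shutDown) {
                return;
            }
            shutDown = true;
            for (AutoCloseable closeable : closeables) {
                try {
                    closeable.close();
                } catch (Exception e) {
                    // Keep going so one failing service does not block the others.
                    System.err.println("Cleanup failed: " + e);
                }
            }
        }
    }

    /** Registers a hook that calls shutDown() when the JVM exits or receives SIGTERM. */
    static Thread installShutdownHook(Services services) {
        Thread hook = new Thread(services::shutDown, "services-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        Services services = new Services();
        services.register(new GpuManager());
        installShutdownHook(services);
        System.out.println("Services started; cleanup runs when the JVM shuts down");
    }
}
```

Making {{shutDown}} idempotent matters here because the fatal-error path and the 
hook may both invoke it during the same shutdown.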



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
