[ 
https://issues.apache.org/jira/browse/FLINK-14565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963809#comment-16963809
 ] 

Till Rohrmann commented on FLINK-14565:
---------------------------------------

Thanks for reporting this issue [~tison]. This does indeed sound like a bug 
that we should fix.

I'm not quite sure whether the {{MetricGroup}} should be responsible for 
managing the lifecycle of the monitoring thread. My concern is that we would 
need to change the interface of the {{MetricGroup}} just to accommodate 
this one situation. An alternative solution could be to let 
{{SystemResourcesMetricsInitializer#instantiateSystemMetrics}} return a handle 
with which we can shut down the thread and the {{MetricGroup}} together (e.g. 
by creating a composite of the {{MetricGroup}} and the {{Thread}}, which is 
then kept in the {{ClusterEntrypoint}}).
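
A minimal sketch of what such a composite handle could look like. All names 
here ({{SystemMetricsHandle}}, {{ClosableGroup}}, {{instantiate}}) are 
hypothetical stand-ins for illustration and are not existing Flink classes or 
signatures; the monitoring loop is a placeholder, not the actual 
{{SystemResourcesCounter}} logic.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SystemMetricsHandleSketch {

    /** Hypothetical stand-in for a metric group; only close() matters here. */
    interface ClosableGroup {
        void close();
    }

    /** Composite that owns both the metric group and its monitoring thread. */
    static final class SystemMetricsHandle implements AutoCloseable {
        private final ClosableGroup group;
        private final Thread monitor;
        private final AtomicBoolean closed = new AtomicBoolean(false);

        SystemMetricsHandle(ClosableGroup group, Thread monitor) {
            this.group = group;
            this.monitor = monitor;
        }

        /** Stops the thread first, then closes the group; safe to call twice. */
        @Override
        public void close() {
            if (!closed.compareAndSet(false, true)) {
                return;
            }
            monitor.interrupt();
            try {
                // Bounded wait so shutdown cannot hang forever.
                monitor.join(5_000);
            } catch (InterruptedException e) {
                // Preserve the caller's interrupt status.
                Thread.currentThread().interrupt();
            } finally {
                group.close();
            }
        }

        boolean isMonitorAlive() {
            return monitor.isAlive();
        }
    }

    /** Hypothetical initializer: starts the thread and returns the handle. */
    static SystemMetricsHandle instantiate(ClosableGroup group) {
        Thread monitor = new Thread(() -> {
            // Placeholder loop: a real counter would sample system resources.
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    // Restore the flag so the loop condition ends the thread.
                    Thread.currentThread().interrupt();
                }
            }
        }, "system-resources-monitor-sketch");
        monitor.setDaemon(true);
        monitor.start();
        return new SystemMetricsHandle(group, monitor);
    }

    public static void main(String[] args) {
        AtomicBoolean groupClosed = new AtomicBoolean(false);
        // The owner (e.g. an entrypoint) keeps the handle and closes it on shutdown.
        SystemMetricsHandle handle = instantiate(() -> groupClosed.set(true));
        handle.close();
        System.out.println("monitor alive: " + handle.isMonitorAlive());
        System.out.println("group closed: " + groupClosed.get());
    }
}
```

With this shape the {{MetricGroup}} interface stays untouched, and whoever 
holds the handle decides when the thread and the group go away together.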

> Shutdown SystemResourcesCounter on (JM|TM)MetricGroup closed
> ------------------------------------------------------------
>
>                 Key: FLINK-14565
>                 URL: https://issues.apache.org/jira/browse/FLINK-14565
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Metrics
>            Reporter: Zili Chen
>            Assignee: Zili Chen
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, we start SystemResourcesCounter when initializing 
> (JM|TM)MetricGroup. This thread doesn't exit when (JM|TM)MetricGroup is 
> closed, and there is no exit logic for it at all.
> This can cause a thread leak. For example, our platform, which supports 
> previewing sample SQL execution, starts a MiniCluster in the same process 
> as the platform. When the preview job finishes, the MiniCluster is closed, 
> and so is (JM|TM)MetricGroup. However, these SystemResourcesCounter threads 
> remain.
> I propose that we track SystemResourcesCounter in (JM|TM)MetricGroup when 
> creating it, and shut it down when (JM|TM)MetricGroup is closed. This way, 
> we avoid thread leaks.
> CC [~chesnay] [~trohrmann]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
