[
https://issues.apache.org/jira/browse/FLINK-14565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963809#comment-16963809
]
Till Rohrmann commented on FLINK-14565:
---------------------------------------
Thanks for reporting this issue [~tison]. This does indeed sound like a bug that
we should fix.
I'm not quite sure whether the {{MetricGroup}} should be responsible for
managing the lifecycle of the monitoring thread. My concern is that we would
need to change the interface of the {{MetricGroup}} just to accommodate
this one situation. An alternative solution could be to let
{{SystemResourcesMetricsInitializer#instantiateSystemMetrics}} return a handle
with which we can shut down the thread and the {{MetricGroup}} together (e.g.
by creating a composite of the {{MetricGroup}} and the {{Thread}}, which is then
kept in the {{ClusterEntrypoint}}).
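To make the alternative more concrete, here is a minimal sketch of such a composite handle. All names in it ({{ClosableGroup}}, {{SimpleGroup}}, {{ProbeThread}}, {{SystemMetricsHandle}}, {{instantiate}}) are hypothetical stand-ins and not Flink's actual classes or signatures. It only assumes that the monitoring thread can be given an explicit exit path and that the group can be closed.

```java
// Hypothetical stand-in for the MetricGroup; only the close lifecycle matters here.
interface ClosableGroup {
    void close();
    boolean isClosed();
}

final class SimpleGroup implements ClosableGroup {
    private volatile boolean closed;

    @Override
    public void close() {
        closed = true;
    }

    @Override
    public boolean isClosed() {
        return closed;
    }
}

// Hypothetical stand-in for the monitoring thread, with an explicit exit path.
final class ProbeThread extends Thread {
    private final long intervalMillis;
    private volatile boolean running = true;

    ProbeThread(long intervalMillis) {
        super("system-resources-probe");
        this.intervalMillis = intervalMillis;
        setDaemon(true);
    }

    @Override
    public void run() {
        while (running) {
            try {
                // A real counter would sample system resources here.
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Woken up by shutdown(); the loop condition decides whether to exit.
            }
        }
    }

    // Signals the loop to stop, wakes the thread up and waits for it to terminate.
    void shutdown() throws InterruptedException {
        running = false;
        interrupt();
        join();
    }
}

// Composite of the group and the thread, so that both are shut down together.
final class SystemMetricsHandle implements AutoCloseable {
    private final ClosableGroup group;
    private final ProbeThread thread;

    private SystemMetricsHandle(ClosableGroup group, ProbeThread thread) {
        this.group = group;
        this.thread = thread;
    }

    // Hypothetical factory: creates the group, starts the thread, returns the handle.
    static SystemMetricsHandle instantiate(long probeIntervalMillis) {
        ClosableGroup group = new SimpleGroup();
        ProbeThread thread = new ProbeThread(probeIntervalMillis);
        thread.start();
        return new SystemMetricsHandle(group, thread);
    }

    ClosableGroup group() {
        return group;
    }

    boolean isThreadAlive() {
        return thread.isAlive();
    }

    @Override
    public void close() {
        try {
            thread.shutdown();
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status.
            Thread.currentThread().interrupt();
        } finally {
            // The group is closed even if stopping the thread was interrupted.
            group.close();
        }
    }
}
```

The owner of the handle (the entrypoint, in this proposal) would keep it for the lifetime of the cluster and call {{close()}} during shutdown. The {{MetricGroup}} interface itself stays unchanged in this variant.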
> Shutdown SystemResourcesCounter on (JM|TM)MetricGroup closed
> ------------------------------------------------------------
>
> Key: FLINK-14565
> URL: https://issues.apache.org/jira/browse/FLINK-14565
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Metrics
> Reporter: Zili Chen
> Assignee: Zili Chen
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, we start SystemResourcesCounter when initializing
> (JM|TM)MetricGroup. This thread doesn't exit when (JM|TM)MetricGroup is
> closed; in fact, it has no exit logic at all.
> This can cause a thread leak. For example, our platform supports
> previewing sample SQL execution, and it starts a MiniCluster in the same
> process as the platform. When the preview job finishes, the MiniCluster is
> closed, and so is (JM|TM)MetricGroup. However, these SystemResourcesCounter
> threads remain.
> I propose that when creating SystemResourcesCounter, we track it in
> (JM|TM)MetricGroup, and when (JM|TM)MetricGroup is closed, we shut down
> SystemResourcesCounter. This way, we avoid thread leaks.
> CC [~chesnay] [~trohrmann]