[
https://issues.apache.org/jira/browse/FLINK-14565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963884#comment-16963884
]
Zili Chen edited comment on FLINK-14565 at 10/31/19 11:14 AM:
--------------------------------------------------------------
Thanks for your insights. I agree that we don't manage this shutdown logic in
{{MetricGroup}} level. For "managing the thread outside the metric system", I
think lifecycle of {{SystemResourcesCounter}} thread is bounded to that of
{{ProcessMetricGroup}}/{{TMMetricGroup}} so that there is a constraint we close
{{SystemResourcesCounter}} on these MetricGroup closed.
You can take a look to the pull request attached that almost respect the
comment above instead I keep the handle of {{SystemResourcesCounter}} in place
inside {{ProcessMetricGroup}}/{{TMMetricGroup}}. Whether or not to introduce a
mix-in style interface and let {{ProcessMetricGroup}}/{{TMMetricGroup}}
implement it is a coin but as mentioned above, in my opinion lifecycle of
{{SystemResourcesCounter}} thread is bounded to that of
{{ProcessMetricGroup}}/{{TMMetricGroup}}. Even we do it said hold a {{Tuple2}}
and close them always together it holds.
was (Author: tison):
Thanks for your insights. I agree that we don't manage this shutdown logic in
{{MetricGroup}} level. For "managing the thread outside the metric system", I
think lifecycle of {{SystemResourcesCounter}} thread is bound to that of
{{ProcessMetricGroup}}/{{TMMetricGroup}} so that there is a constraint we close
{{SystemResourcesCounter}} on these MetricGroup closed.
You can take a look to the pull request attached that almost respect the
comment above instead I keep the handle of {{SystemResourcesCounter}} in place
inside {{ProcessMetricGroup}}/{{TMMetricGroup}}. Whether or not to introduce a
mix-in style interface and let {{ProcessMetricGroup}}/{{TMMetricGroup}}
implement it is a coin but as mentioned above, in my opinion lifecycle of
{{SystemResourcesCounter}} thread is bound to that of
{{ProcessMetricGroup}}/{{TMMetricGroup}}. Even we do it said hold a {{Tuple2}}
and close them always together it holds.
> Shutdown SystemResourcesCounter on (JM|TM)MetricGroup closed
> ------------------------------------------------------------
>
> Key: FLINK-14565
> URL: https://issues.apache.org/jira/browse/FLINK-14565
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Metrics
> Reporter: Zili Chen
> Assignee: Zili Chen
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, we start SystemResourcesCounter when initialize
> (JM|TM)MetricGroup. This thread doesn't exit on (JM|TM)MetricGroup closed and
> even there is not exit logic of them.
> It possibly causes thread leak. For example, on our platform which supports
> previewing sample SQL execution, it starts a MiniCluster in the same process
> as the platform. When the preview job finished MiniCluster closed and also
> (JM|TM)MetricGroup. However these SystemResourcesCounter threads remain.
> I propose when creating SystemResourcesCounter, track it in
> (JM|TM)MetricGroup, and on (JM|TM)MetricGroup closed, shutdown
> SystemResourcesCounter. This way, we survive from thread leaks.
> CC [~chesnay] [~trohrmann]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)