[ 
https://issues.apache.org/jira/browse/FLINK-14565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963884#comment-16963884
 ] 

Zili Chen edited comment on FLINK-14565 at 10/31/19 11:14 AM:
--------------------------------------------------------------

Thanks for your insights. I agree that we don't manage this shutdown logic in 
{{MetricGroup}} level. For "managing the thread outside the metric system", I 
think lifecycle of {{SystemResourcesCounter}} thread is bounded to that of 
{{ProcessMetricGroup}}/{{TMMetricGroup}} so that there is a constraint we close 
{{SystemResourcesCounter}} on these MetricGroup closed.

You can take a look to the pull request attached that almost respect the 
comment above instead I keep the handle of {{SystemResourcesCounter}} in place 
inside {{ProcessMetricGroup}}/{{TMMetricGroup}}. Whether or not to introduce a 
mix-in style interface and let {{ProcessMetricGroup}}/{{TMMetricGroup}} 
implement it is a coin but as mentioned above, in my opinion lifecycle of 
{{SystemResourcesCounter}} thread is bounded to that of 
{{ProcessMetricGroup}}/{{TMMetricGroup}}. Even we do it said hold a {{Tuple2}} 
and close them always together it holds.


was (Author: tison):
Thanks for your insights. I agree that we don't manage this shutdown logic in 
{{MetricGroup}} level. For "managing the thread outside the metric system", I 
think lifecycle of {{SystemResourcesCounter}} thread is bound to that of 
{{ProcessMetricGroup}}/{{TMMetricGroup}} so that there is a constraint we close 
{{SystemResourcesCounter}} on these MetricGroup closed.

You can take a look to the pull request attached that almost respect the 
comment above instead I keep the handle of {{SystemResourcesCounter}} in place 
inside {{ProcessMetricGroup}}/{{TMMetricGroup}}. Whether or not to introduce a 
mix-in style interface and let {{ProcessMetricGroup}}/{{TMMetricGroup}} 
implement it is a coin but as mentioned above, in my opinion lifecycle of 
{{SystemResourcesCounter}} thread is bound to that of 
{{ProcessMetricGroup}}/{{TMMetricGroup}}. Even we do it said hold a {{Tuple2}} 
and close them always together it holds.

> Shutdown SystemResourcesCounter on (JM|TM)MetricGroup closed
> ------------------------------------------------------------
>
>                 Key: FLINK-14565
>                 URL: https://issues.apache.org/jira/browse/FLINK-14565
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Metrics
>            Reporter: Zili Chen
>            Assignee: Zili Chen
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, we start SystemResourcesCounter when initialize 
> (JM|TM)MetricGroup. This thread doesn't exit on (JM|TM)MetricGroup closed and 
> even there is not exit logic of them.
> It possibly causes thread leak. For example, on our platform which supports 
> previewing sample SQL execution, it starts a MiniCluster in the same process 
> as the platform. When the preview job finished MiniCluster closed and also 
> (JM|TM)MetricGroup. However these SystemResourcesCounter threads remain.
> I propose when creating SystemResourcesCounter, track it in 
> (JM|TM)MetricGroup, and on (JM|TM)MetricGroup closed, shutdown 
> SystemResourcesCounter. This way, we survive from thread leaks.
> CC [~chesnay] [~trohrmann]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to