[
https://issues.apache.org/jira/browse/TEZ-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rajesh Balamohan updated TEZ-4087:
----------------------------------
Summary: Shuffle: Fix shuffle cleanup to prevent thread leaks (was:
Shuffle: Check for thread's liveliness regularly to avoid infinite wait in
merger & referee threads)
> Shuffle: Fix shuffle cleanup to prevent thread leaks
> ----------------------------------------------------
>
> Key: TEZ-4087
> URL: https://issues.apache.org/jira/browse/TEZ-4087
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Priority: Major
>
> In certain cases, Shuffle's cleanupIgnoreErrors() is not called. This leaves
> 4 threads (inmem, diskmerger, Referee, ShuffleAndMergeRunner) running forever.
> When this happens in long-running processes (e.g., LLAP in Hive), the process
> reaches its thread limit over time.
> Note: The root cause of cleanupIgnoreErrors() not being invoked is not yet
> known. I will share details when I have more information. This ticket was
> created to add safety knobs that ensure thread leaks do not happen.
>
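For illustration only, here is a minimal sketch of the kind of safety knob the earlier summary hinted at (checking a thread's liveness regularly instead of waiting indefinitely). This is not the Tez implementation: the class name GuardedWorker, its constructor parameters, and the stop() method are all hypothetical. The idea is that a background thread uses a timed wait and exits on its own once its owner thread has died, even if an explicit cleanup call never arrives.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch; names and structure are assumptions, not Tez code. */
public class GuardedWorker implements Runnable {
    private final Thread owner;        // thread whose death should stop this worker
    private final long pollMillis;     // how often liveness is re-checked
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private final Object lock = new Object();

    public GuardedWorker(Thread owner, long pollMillis) {
        this.owner = owner;
        this.pollMillis = pollMillis;
    }

    /** Explicit cleanup path; wakes the worker so it exits promptly. */
    public void stop() {
        stopped.set(true);
        synchronized (lock) {
            lock.notifyAll();
        }
    }

    @Override
    public void run() {
        synchronized (lock) {
            // Timed wait instead of an unbounded wait(): the loop condition is
            // re-evaluated at least every pollMillis, so a missed stop() call
            // cannot leave this thread parked forever once the owner is gone.
            while (!stopped.get() && owner.isAlive()) {
                try {
                    lock.wait(pollMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return;
                }
            }
        }
    }
}
```

With this shape, the worker terminates either when stop() is called or within roughly pollMillis of the owner thread finishing. Choosing the poll interval is a trade-off between wake-up overhead and how long a leaked thread may linger.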