Rajesh Balamohan created TEZ-4087:
-------------------------------------
Summary: Shuffle: Check for thread's liveliness regularly to avoid
infinite wait in merger & referee threads
Key: TEZ-4087
URL: https://issues.apache.org/jira/browse/TEZ-4087
Project: Apache Tez
Issue Type: Bug
Reporter: Rajesh Balamohan
In certain cases, Shuffle's cleanupIgnoreErrors() is not called. This leaves 4
threads (inmem, diskmerger, Referee, ShuffleAndMergeRunner) running forever.
When this happens in long-running processes (e.g., LLAP in Hive), the leaked
threads accumulate and the process eventually reaches its thread limit.
Note: The root cause of cleanupIgnoreErrors() not being invoked is not yet
known. I will share more details once I have them. This ticket was created
as an add-on safety measure so that thread leaks do not happen.
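
To illustrate the kind of safety check proposed here, the following is a
minimal sketch, not the actual Tez patch. It assumes a helper thread that
would otherwise block indefinitely on a condition; instead it waits with a
timeout and re-checks whether an "owner" thread is still alive. The class
LivenessAwareWaiter, its methods, and the check interval are all hypothetical
names chosen for this example.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of a bounded wait with a periodic liveness check. */
public class LivenessAwareWaiter {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition workAvailable = lock.newCondition();
  private final Thread owner;         // thread whose exit should end the wait
  private final long checkIntervalMs; // how often liveness is re-checked
  private boolean hasWork = false;

  public LivenessAwareWaiter(Thread owner, long checkIntervalMs) {
    this.owner = owner;
    this.checkIntervalMs = checkIntervalMs;
  }

  /** Called by a producer to hand work to the waiting thread. */
  public void signalWork() {
    lock.lock();
    try {
      hasWork = true;
      workAvailable.signalAll();
    } finally {
      lock.unlock();
    }
  }

  /**
   * Waits for work, waking up every checkIntervalMs to verify the owner.
   * Returns true if work arrived, false if the owner is no longer alive
   * or this thread was interrupted. Note that Thread.isAlive() is also
   * false for a thread that has not been started yet.
   */
  public boolean awaitWorkOrOwnerExit() {
    lock.lock();
    try {
      while (!hasWork) {
        if (!owner.isAlive()) {
          return false; // owner is gone: exit instead of waiting forever
        }
        // Timed wait instead of an unbounded await().
        workAvailable.await(checkIntervalMs, TimeUnit.MILLISECONDS);
      }
      hasWork = false;
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    } finally {
      lock.unlock();
    }
  }
}
```

A merger-style loop could call awaitWorkOrOwnerExit() and return from run()
when it yields false, so the thread ends even if an explicit cleanup call
never arrives. The trade-off is a periodic wake-up per waiting thread, so the
interval should be long enough to keep that overhead negligible.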