Will Berkeley created KUDU-2805:
-----------------------------------
Summary: ClientTest.TestServerTooBusyRetry fails due to TSAN thread limit
Key: KUDU-2805
URL: https://issues.apache.org/jira/browse/KUDU-2805
Project: Kudu
Issue Type: Bug
Affects Versions: 1.9.0
Reporter: Will Berkeley
I've seen a couple of instances where ClientTest.TestServerTooBusyRetry fails
by hitting the TSAN thread limit after seemingly being stuck for 10 minutes
or so. The end of the log looks like this:
{noformat}
W0428 12:20:07.406752 10297 debug-util.cc:397] Leaking SignalData structure 0x7b08000c2ba0 after lost signal to thread 8435
W0428 12:20:07.412693 10297 debug-util.cc:397] Leaking SignalData structure 0x7b080019f2a0 after lost signal to thread 10185
W0428 12:20:07.418191 10297 debug-util.cc:397] Leaking SignalData structure 0x7b080018f060 after lost signal to thread 10361
W0428 12:20:23.873589 10139 debug-util.cc:397] Leaking SignalData structure 0x7b08000fc360 after lost signal to thread 8435
W0428 12:20:23.878401 10139 debug-util.cc:397] Leaking SignalData structure 0x7b08000ccf20 after lost signal to thread 10185
W0428 12:20:23.884522 10139 debug-util.cc:397] Leaking SignalData structure 0x7b0800051ae0 after lost signal to thread 10361
W0428 12:22:03.715726 10297 debug-util.cc:397] Leaking SignalData structure 0x7b08000f9280 after lost signal to thread 8435
W0428 12:22:03.721261 10297 debug-util.cc:397] Leaking SignalData structure 0x7b08001b0e40 after lost signal to thread 10185
W0428 12:22:03.727725 10297 debug-util.cc:397] Leaking SignalData structure 0x7b08000b7460 after lost signal to thread 10361
W0428 12:22:11.928373 10139 debug-util.cc:397] Leaking SignalData structure 0x7b0800044be0 after lost signal to thread 8435
W0428 12:22:11.933187 10139 debug-util.cc:397] Leaking SignalData structure 0x7b080018f3c0 after lost signal to thread 10185
W0428 12:22:11.939275 10139 debug-util.cc:397] Leaking SignalData structure 0x7b08001b3480 after lost signal to thread 10361
==8432==ThreadSanitizer: Thread limit (8128 threads) exceeded. Dying.
{noformat}
Some threads (8435, 10185, and 10361 in the excerpt above) are unresponsive,
even to the signals sent by the stack trace collector thread. Unfortunately,
there's nothing else in the logs about those threads.
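
To check whether the same threads lose signals in other failed runs, something like the following rough sketch could tally the "lost signal" warnings per thread ID. It is not part of Kudu: the function name count_lost_signals and the regular expression are assumptions based only on the message format in the excerpt above, and the sample text is the first six warnings from that excerpt.

```python
import re
from collections import Counter

# First six warnings from the log excerpt, each joined onto one line.
SAMPLE_LOG = """\
W0428 12:20:07.406752 10297 debug-util.cc:397] Leaking SignalData structure 0x7b08000c2ba0 after lost signal to thread 8435
W0428 12:20:07.412693 10297 debug-util.cc:397] Leaking SignalData structure 0x7b080019f2a0 after lost signal to thread 10185
W0428 12:20:07.418191 10297 debug-util.cc:397] Leaking SignalData structure 0x7b080018f060 after lost signal to thread 10361
W0428 12:20:23.873589 10139 debug-util.cc:397] Leaking SignalData structure 0x7b08000fc360 after lost signal to thread 8435
W0428 12:20:23.878401 10139 debug-util.cc:397] Leaking SignalData structure 0x7b08000ccf20 after lost signal to thread 10185
W0428 12:20:23.884522 10139 debug-util.cc:397] Leaking SignalData structure 0x7b0800051ae0 after lost signal to thread 10361
"""

# Assumed message format, taken from the excerpt only.
LOST_SIGNAL_RE = re.compile(r"after lost signal to thread (\d+)")


def count_lost_signals(log_text):
    """Return a Counter mapping thread ID to number of lost-signal warnings."""
    # Collapse whitespace so a warning wrapped across two lines still matches.
    normalized = " ".join(log_text.split())
    return Counter(int(tid) for tid in LOST_SIGNAL_RE.findall(normalized))


if __name__ == "__main__":
    for tid, n in sorted(count_lost_signals(SAMPLE_LOG).items()):
        print(f"thread {tid}: {n} lost signal(s)")
```

A small, fixed set of thread IDs showing up repeatedly would suggest that a few specific threads are wedged, rather than signals being dropped at random.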