[kudu-CR] KUDU-2215. kernel stack watchdog: avoid blocking thread exit
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8536 )

Change subject: KUDU-2215. kernel_stack_watchdog: avoid blocking thread exit
......................................................................

KUDU-2215. kernel_stack_watchdog: avoid blocking thread exit

This changes the stack watchdog so that thread unregistration no longer
blocks if the watchdog thread is in the middle of dumping a stack. This
is to try to avoid cases where a user thread is waiting to join on
another thread, but that thread is blocked due to watchdog interference.

A new stress-test/benchmark verifies the improvement. It simulates slow
stack trace collection by injecting latency into the watchdog thread,
and then starts and joins threads in a loop for 5 seconds. Without the
fix, it was only able to start about 1000 threads/second, whereas with
the fix it's able to start 10,000 threads/second.

Change-Id: Ib6a349666e8484c00b2f43c5918205ec1a4c09ab
Reviewed-on: http://gerrit.cloudera.org:8080/8536
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong
---
M src/kudu/util/fault_injection.cc
M src/kudu/util/fault_injection.h
M src/kudu/util/kernel_stack_watchdog.cc
M src/kudu/util/kernel_stack_watchdog.h
M src/kudu/util/stack_watchdog-test.cc
5 files changed, 149 insertions(+), 21 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Andrew Wong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/8536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib6a349666e8484c00b2f43c5918205ec1a4c09ab
Gerrit-Change-Number: 8536
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2215. kernel stack watchdog: avoid blocking thread exit
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8536 )

Change subject: KUDU-2215. kernel_stack_watchdog: avoid blocking thread exit
......................................................................

Patch Set 4: Code-Review+2

--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6a349666e8484c00b2f43c5918205ec1a4c09ab
Gerrit-Change-Number: 8536
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Sat, 18 Nov 2017 04:27:00 +
Gerrit-HasComments: No
[kudu-CR] KUDU-2215. kernel stack watchdog: avoid blocking thread exit
Hello Tidy Bot, Andrew Wong, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8536

to look at the new patch set (#4).

Change subject: KUDU-2215. kernel_stack_watchdog: avoid blocking thread exit
......................................................................

KUDU-2215. kernel_stack_watchdog: avoid blocking thread exit

This changes the stack watchdog so that thread unregistration no longer
blocks if the watchdog thread is in the middle of dumping a stack. This
is to try to avoid cases where a user thread is waiting to join on
another thread, but that thread is blocked due to watchdog interference.

A new stress-test/benchmark verifies the improvement. It simulates slow
stack trace collection by injecting latency into the watchdog thread,
and then starts and joins threads in a loop for 5 seconds. Without the
fix, it was only able to start about 1000 threads/second, whereas with
the fix it's able to start 10,000 threads/second.

Change-Id: Ib6a349666e8484c00b2f43c5918205ec1a4c09ab
---
M src/kudu/util/fault_injection.cc
M src/kudu/util/fault_injection.h
M src/kudu/util/kernel_stack_watchdog.cc
M src/kudu/util/kernel_stack_watchdog.h
M src/kudu/util/stack_watchdog-test.cc
5 files changed, 149 insertions(+), 21 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/8536/4

--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib6a349666e8484c00b2f43c5918205ec1a4c09ab
Gerrit-Change-Number: 8536
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2215. kernel stack watchdog: avoid blocking thread exit
Hello Tidy Bot, Andrew Wong, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8536

to look at the new patch set (#3).

Change subject: KUDU-2215. kernel_stack_watchdog: avoid blocking thread exit
......................................................................

KUDU-2215. kernel_stack_watchdog: avoid blocking thread exit

This changes the stack watchdog so that thread unregistration no longer
blocks if the watchdog thread is in the middle of dumping a stack. This
is to try to avoid cases where a user thread is waiting to join on
another thread, but that thread is blocked due to watchdog interference.

A new stress-test/benchmark verifies the improvement. It simulates slow
stack trace collection by injecting latency into the watchdog thread,
and then starts and joins threads in a loop for 5 seconds. Without the
fix, it was only able to start about 1000 threads/second, whereas with
the fix it's able to start 10,000 threads/second.

Change-Id: Ib6a349666e8484c00b2f43c5918205ec1a4c09ab
---
M src/kudu/util/fault_injection.cc
M src/kudu/util/fault_injection.h
M src/kudu/util/kernel_stack_watchdog.cc
M src/kudu/util/kernel_stack_watchdog.h
M src/kudu/util/stack_watchdog-test.cc
5 files changed, 148 insertions(+), 21 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/8536/3

--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib6a349666e8484c00b2f43c5918205ec1a4c09ab
Gerrit-Change-Number: 8536
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2215. kernel stack watchdog: avoid blocking thread exit
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8536 )

Change subject: KUDU-2215. kernel_stack_watchdog: avoid blocking thread exit
......................................................................

Patch Set 1:

(4 comments)

Injection looks good, mostly nits here

http://gerrit.cloudera.org:8080/#/c/8536/1/src/kudu/util/kernel_stack_watchdog.h
File src/kudu/util/kernel_stack_watchdog.h:

http://gerrit.cloudera.org:8080/#/c/8536/1/src/kudu/util/kernel_stack_watchdog.h@176
PS1, Line 176: static void ThreadExiting(void* tls_void);
nit: maybe doc here that this is used internally by CreateTLS to create
thread-local destructors?

http://gerrit.cloudera.org:8080/#/c/8536/1/src/kudu/util/kernel_stack_watchdog.cc
File src/kudu/util/kernel_stack_watchdog.cc:

http://gerrit.cloudera.org:8080/#/c/8536/1/src/kudu/util/kernel_stack_watchdog.cc@151
PS1, Line 151: vector<...> to_delete;
:     {
:       lock_guard l(tls_lock_);
:       to_delete.swap(pending_delete_);
:       tls_map_copy = tls_by_tid_;
:     }
:     to_delete.clear();
Is it important to document somewhere that the pending TLS instances
only get destructed in RunThread? Or is that more of an implementation
detail?

http://gerrit.cloudera.org:8080/#/c/8536/1/src/kudu/util/stack_watchdog-test.cc
File src/kudu/util/stack_watchdog-test.cc:

http://gerrit.cloudera.org:8080/#/c/8536/1/src/kudu/util/stack_watchdog-test.cc@139
PS1, Line 139: std::t
nit: drop std::

http://gerrit.cloudera.org:8080/#/c/8536/1/src/kudu/util/stack_watchdog-test.cc@139
PS1, Line 139: i
nit: could replace with `started % threads.size()`? One fewer variable
to think about, as trivial as it is

--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6a349666e8484c00b2f43c5918205ec1a4c09ab
Gerrit-Change-Number: 8536
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Mon, 13 Nov 2017 23:45:13 +
Gerrit-HasComments: Yes
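
Editor's sketch: the snippet quoted at kernel_stack_watchdog.cc line 151 suggests a deferred-destruction pattern, in which an exiting thread hands its TLS object to a pending list under a short lock, and the watchdog loop later swaps that list out and destroys the objects outside the lock. The following is a minimal illustration of that idea, not the Kudu implementation. The names tls_lock_, pending_delete_, tls_by_tid_, to_delete and tls_map_copy come from the quoted snippet; WatchdogSketch, TLS, Register, Unregister, RunOnce and NumRegistered, along with all types, are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the watchdog's per-thread state.
struct TLS {
  int depth = 0;
};

// Hypothetical class; only the member names mirror the quoted snippet.
class WatchdogSketch {
 public:
  // Called when a thread starts being watched.
  void Register(int tid, std::unique_ptr<TLS> tls) {
    std::lock_guard<std::mutex> l(tls_lock_);
    tls_by_tid_[tid] = std::move(tls);
  }

  // Called from the exiting thread. Only the short-lived tls_lock_ is
  // taken, so this never waits for a slow stack dump to finish.
  void Unregister(int tid) {
    std::lock_guard<std::mutex> l(tls_lock_);
    auto it = tls_by_tid_.find(tid);
    if (it == tls_by_tid_.end()) return;
    // Defer destruction: the watchdog may still hold a raw pointer to
    // this object from its most recent map copy.
    pending_delete_.push_back(std::move(it->second));
    tls_by_tid_.erase(it);
  }

  // One iteration of the watchdog loop. Returns how many TLS objects
  // were destroyed during this pass.
  std::size_t RunOnce() {
    std::vector<std::unique_ptr<TLS>> to_delete;
    std::map<int, TLS*> tls_map_copy;
    {
      std::lock_guard<std::mutex> l(tls_lock_);
      to_delete.swap(pending_delete_);
      for (const auto& e : tls_by_tid_) {
        tls_map_copy[e.first] = e.second.get();
      }
    }
    const std::size_t destroyed = to_delete.size();
    to_delete.clear();  // destructors run here, outside the lock

    // A real watchdog would inspect each thread and possibly dump its
    // stack here. The pointers stay valid for this whole pass because
    // anything unregistered in the meantime sits in pending_delete_
    // until the next call to RunOnce().
    for (const auto& e : tls_map_copy) {
      (void)e.second->depth;
    }
    return destroyed;
  }

  std::size_t NumRegistered() {
    std::lock_guard<std::mutex> l(tls_lock_);
    return tls_by_tid_.size();
  }

 private:
  std::mutex tls_lock_;
  std::map<int, std::unique_ptr<TLS>> tls_by_tid_;
  std::vector<std::unique_ptr<TLS>> pending_delete_;
};
```

In this sketch, pending objects are destroyed only inside RunOnce(), which corresponds to the reviewer's question about TLS instances being destructed only in RunThread.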
[kudu-CR] KUDU-2215. kernel stack watchdog: avoid blocking thread exit
Hello Andrew Wong,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/8536

to review the following change.

Change subject: KUDU-2215. kernel_stack_watchdog: avoid blocking thread exit
......................................................................

KUDU-2215. kernel_stack_watchdog: avoid blocking thread exit

This changes the stack watchdog so that thread unregistration no longer
blocks if the watchdog thread is in the middle of dumping a stack. This
is to try to avoid cases where a user thread is waiting to join on
another thread, but that thread is blocked due to watchdog interference.

A new stress-test/benchmark verifies the improvement. It simulates slow
stack trace collection by injecting latency into the watchdog thread,
and then starts and joins threads in a loop for 5 seconds. Without the
fix, it was only able to start about 1000 threads/second, whereas with
the fix it's able to start 10,000 threads/second.

Change-Id: Ib6a349666e8484c00b2f43c5918205ec1a4c09ab
---
M src/kudu/util/fault_injection.cc
M src/kudu/util/fault_injection.h
M src/kudu/util/kernel_stack_watchdog.cc
M src/kudu/util/kernel_stack_watchdog.h
M src/kudu/util/stack_watchdog-test.cc
5 files changed, 139 insertions(+), 17 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/8536/1

--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib6a349666e8484c00b2f43c5918205ec1a4c09ab
Gerrit-Change-Number: 8536
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Andrew Wong
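
Editor's sketch: the commit message describes a benchmark that starts and joins threads in a loop for a fixed period and reports the rate. A rough illustration of such a loop is shown below. It uses std::thread rather than Kudu's own thread class, does not inject any watchdog latency, and the function name StartAndJoinThreads and its parameters are hypothetical. The slot index uses the `started % threads.size()` form suggested in the review of patch set 1.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Starts and joins short-lived threads for roughly `duration`, keeping at
// most `slots` threads alive at a time. Returns the number of threads
// started. Hypothetical helper, written only for this illustration.
inline std::size_t StartAndJoinThreads(std::size_t slots,
                                       std::chrono::milliseconds duration) {
  assert(slots > 0);
  std::vector<std::thread> threads(slots);
  const auto deadline = std::chrono::steady_clock::now() + duration;
  std::size_t started = 0;
  while (std::chrono::steady_clock::now() < deadline) {
    // Reuse slots round-robin; the counter doubles as the index.
    std::thread& slot = threads[started % threads.size()];
    // Join the previous occupant before reusing the slot. In the scenario
    // the change addresses, this join is where the caller would stall if
    // thread exit were blocked behind the watchdog.
    if (slot.joinable()) slot.join();
    slot = std::thread([] {});
    ++started;
  }
  for (auto& t : threads) {
    if (t.joinable()) t.join();
  }
  return started;
}
```

Dividing the returned count by the duration in seconds gives a threads/second figure comparable in kind to the one quoted in the commit message; for example, StartAndJoinThreads(8, std::chrono::seconds(5)) / 5.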