Todd Lipcon has posted comments on this change.

Change subject: kernel_stack_watchdog: avoid blocking threads starting
......................................................................


Patch Set 2:

(2 comments)

To find the root causes I was basically just looking at gstacks and adding 
LOG_IF_SLOW calls in various places, nothing too fancy.

http://gerrit.cloudera.org:8080/#/c/4626/2//COMMIT_MSG
Commit Message:

PS2, Line 12: TSAN defers signal-handling
> Just so I understand, what you mean is that TSAN handles the signal but tak
Yeah, I put "LOG_IF_SLOW" around the Register(TLS) function and found that it 
was sometimes blocked for 100+ ms, usually at the same time as the watchdog 
was attempting to dump a stack.
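
For readers following along, the timing approach can be sketched roughly as 
below. This is a minimal stand-in, not Kudu's LOG_IF_SLOW itself: the name 
RunAndLogIfSlow, its signature, and the stderr output are all assumptions 
made for illustration.

```cpp
#include <chrono>
#include <functional>
#include <iostream>
#include <string>
#include <thread>

// Hypothetical helper (an assumption, not Kudu's LOG_IF_SLOW): runs `fn` and
// writes a line to stderr if it took longer than `threshold`.
// Returns true if the call was slow.
bool RunAndLogIfSlow(const std::string& what,
                     std::chrono::milliseconds threshold,
                     const std::function<void()>& fn) {
  const auto start = std::chrono::steady_clock::now();
  fn();
  const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
      std::chrono::steady_clock::now() - start);
  const bool slow = elapsed > threshold;
  if (slow) {
    std::cerr << what << " took " << elapsed.count() << " ms" << std::endl;
  }
  return slow;
}
```

Wrapping a suspect call such as the registration path in a helper like this, 
with a threshold well under 100 ms, is enough to spot the kind of stall 
described above and compare its timing against watchdog activity.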


PS2, Line 21: However, it's still important to prevent these
            : threads from _exiting_ while we are looking at their TLS
> But presumably delaying Thread.Join() has little to no effect on test flaki
Yeah, there's a comment in the code to that effect. Thread _exits_ are basically 
never on a critical path, whereas thread creation often is (e.g. starting 
threadpool workers).
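
One way to get that asymmetry is sketched below, assuming a two-lock scheme. 
This is not the code from this change: WatchdogRegistry, ThreadState, and the 
lock names are hypothetical. Registration takes only a short-lived map lock, 
while unregistration also waits on a lock the watchdog holds for the whole 
scan, so a thread's state cannot go away while it is being inspected.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-thread (TLS) state the watchdog reads.
struct ThreadState {
  int depth = 0;
};

// Hypothetical registry illustrating the locking asymmetry (an assumption,
// not the actual patch).
class WatchdogRegistry {
 public:
  // Thread start: only the short-lived map lock, never the scan lock.
  void Register(int tid, ThreadState* state) {
    std::lock_guard<std::mutex> l(map_lock_);
    threads_[tid] = state;
  }

  // Thread exit: waits for any in-progress scan to finish first.
  void Unregister(int tid) {
    std::lock_guard<std::mutex> scan(scan_lock_);
    std::lock_guard<std::mutex> l(map_lock_);
    threads_.erase(tid);
  }

  // Watchdog: holds the scan lock throughout, but the map lock only long
  // enough to copy the current entries.
  template <typename Fn>
  void Scan(Fn inspect) {
    std::lock_guard<std::mutex> scan(scan_lock_);
    std::vector<std::pair<int, ThreadState*>> snapshot;
    {
      std::lock_guard<std::mutex> l(map_lock_);
      snapshot.assign(threads_.begin(), threads_.end());
    }
    for (const auto& entry : snapshot) {
      inspect(entry.first, entry.second);
    }
  }

  std::size_t Size() {
    std::lock_guard<std::mutex> l(map_lock_);
    return threads_.size();
  }

 private:
  std::mutex scan_lock_;  // held for a whole scan; taken by Unregister
  std::mutex map_lock_;   // protects threads_; held only briefly
  std::map<int, ThreadState*> threads_;
};
```

In this sketch a slow inspection delays only Unregister callers; Register can 
proceed even while a scan is in progress.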


-- 
To view, visit http://gerrit.cloudera.org:8080/4626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7af85ade6ec9050843ec5b146d26c2549c503d8f
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes