Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/8257 )
Change subject: thread: improve performance of starting threads ...................................................................... Patch Set 3: (10 comments) http://gerrit.cloudera.org:8080/#/c/8257/2/src/kudu/client/client-test.cc File src/kudu/client/client-test.cc: http://gerrit.cloudera.org:8080/#/c/8257/2/src/kudu/client/client-test.cc@4805 PS2, Line 4805: SleepFor(MonoDelta::FromMilliseconds(100)); > Seems like the ideal value here would be different for TSAN/ASAN than for r I guess the thinking is that we don't really know how many client threads are sufficient to drive the load required to overrun the server. I didn't want to redesign the test here, so just tweaked it to avoid the issue where threads started too fast. http://gerrit.cloudera.org:8080/#/c/8257/2/src/kudu/util/thread.cc File src/kudu/util/thread.cc: http://gerrit.cloudera.org:8080/#/c/8257/2/src/kudu/util/thread.cc@a589 PS2, Line 589: > Curious why you moved this out of here and into StartThread? I hit an odd TSAN race with detached threads and thread destructors -- something where I'd started a thread and then exited main(), which ran the ThreadManager destrcutor, and TSAN reported that as not-synchronized-with the SuperviseThread function. I think we used to get some 'happens-before" by the synchronization on the tid_ variable but lost it, so that race surfaced itself. http://gerrit.cloudera.org:8080/#/c/8257/2/src/kudu/util/thread.cc@456 PS2, Line 456: Status ThreadJoiner::Join() { : if (Thread::current_thread() && : Thread::current_thread()->tid() == thread_->tid()) { : > Given tid()'s new behavior, should we demote this to a DCHECK? I think it's fine as is, because either: (a) the calling thread is the current thread, in which case the tid_ would already have been assigned (because we don't call the user functor until after we've assigned it) or: (b) the calling thread is not the current thread, in which case we're going to proceed to join() anyway, so regardless we would block until the thread has truly started (since we're going to block until it finishes). http://gerrit.cloudera.org:8080/#/c/8257/2/src/kudu/util/thread.cc@517 PS2, Line 517: > I presume we don't have ToString() in any hot path? yea I don't think so - only in logging, etc, where the few milliseconds we might need to wait for a thread to start isn't a big deal. Assumption is that worst case we're just *moving* a bit of wait time from start until the first access to the Thread object, on the optimistic assumption that we rarely actually access the tid() immediately after starting anyway. http://gerrit.cloudera.org:8080/#/c/8257/2/src/kudu/util/thread.cc@553 PS2, Line 553: *holder = t; > Surely there's no use case for this? 'holder' is supposed to be a pure OUT haha, I wrote it that way first, but then said to myself "I bet Adar is going to nit-pick this that you shouldn't modify out parameters if you return an unsuccessful status!" But, I agree with you. I'll just change it to simplify the code. http://gerrit.cloudera.org:8080/#/c/8257/2/src/kudu/util/thread.cc@575 PS2, Line 575: { > Could you use a MakeScopedCleanup for this, and cancel it on success? Done http://gerrit.cloudera.org:8080/#/c/8257/2/src/kudu/util/thread.cc@591 PS2, Line 591: > This doesn't seem like something we'd want here, given tid()'s new behavior well, it's only in VLOG, so basically if --v=2 is enabled we'll lose the perf optimization. But the cost of writing a log message is already pretty high so we aren't perf-sensitive in that case. Keep in mind that VLOG lazy-evaluates its arguments. http://gerrit.cloudera.org:8080/#/c/8257/2/src/kudu/util/thread.cc@617 PS2, Line 617: string name = strings::Substitute("$0-$1", t->name(), system_tid); : thread_manager->SetThreadName(name, t->tid_); > Could use tid_ here instead. Done http://gerrit.cloudera.org:8080/#/c/8257/2/src/kudu/util/thread.cc@646 PS2, Line 646: > Could use tid_ here instead too. Done http://gerrit.cloudera.org:8080/#/c/8257/2/src/kudu/util/thread.cc@648 PS2, Line 648: t->Release(); > Maybe add a brief comment explaining that this could cause t to destruct, s Done -- To view, visit http://gerrit.cloudera.org:8080/8257 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5ce7f409ed33548142d1180c9f9653a8f51a7879 Gerrit-Change-Number: 8257 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Comment-Date: Fri, 13 Oct 2017 01:10:14 +0000 Gerrit-HasComments: Yes
