Todd Lipcon has posted comments on this change. Change subject: rpc: LIFO service queue ......................................................................
Patch Set 7: (22 comments) will push these changes as a new patch http://gerrit.cloudera.org:8080/#/c/2938/7/build-support/tsan-suppressions.txt File build-support/tsan-suppressions.txt: Line 26: # local_addr_space_init(), suppressing init() suppresses both races. > Apparently this comment is no longer correct. Can you update it? Done http://gerrit.cloudera.org:8080/#/c/2938/7/src/kudu/rpc/service_queue-test.cc File src/kudu/rpc/service_queue-test.cc: Line 34: using std::vector; > Nit: unsorted. Done Line 62: InboundCall * call = new InboundCall(nullptr); > Nit: InboundCall* call Done Line 66: LOG(INFO) << "queue, exit"; > Nit: queue full, exit Done Line 90: call.reset(nullptr); > Didn't know you needed to provide an argument. Done Line 96: vector<shared_ptr<boost::thread>> producers; > Since this is a new test, could we use std::thread instead? yea, and I dont think the shared_ptr is necessary since std::thread is moveable. Line 99: for (int i = 0; i< FLAGS_num_producers; i++) { > Nit: i < FLAGS_ Done Line 120: total_queue_len = queue.estimated_queue_length(); : total_idle_workers = queue.estimated_idle_worker_count(); > What's the point of calling this inside the loop? Each call overwrites the oops, those were supposed to be +=s http://gerrit.cloudera.org:8080/#/c/2938/7/src/kudu/rpc/service_queue.cc File src/kudu/rpc/service_queue.cc: Line 36: bool LifoServiceQueue::BlockingGet(std::unique_ptr<InboundCall> *out) { > Nit: type* out Done Line 39: lock_guard<simple_spinlock> l(&lock_); > Nit: technically we need only lock for the consumers_ mutation. Done Line 41: consumers_.emplace_back(consumer); > AFAICT, consumers_ just accumulates entries until the entire queue is destr yea, the idea is to avoid the memory leak, though I suppose we could instead have the threads delete their own ConsumerState on shutdown. You think that's preferable? Mind if I just add a TODO and address it later since there are a few other patches in flight for this code? Line 53: if (shutdown_) { > Nit: maybe PREDICT_FALSE here? Done Line 59: InboundCall* call = consumer->Wait(); > Maybe add a comment saying that if 'call' was null, we're shutting down the Done Line 80: // Notify condition var(and wake up consumer thread) takes time, > Really? Somewhat surprised by that. yea, the futex_wake has to take some kernel spinlocks and semaphores which implies at least some potential for cacheline bounces if not contention. (the kernel internally has a map<void*, futex_wait_queue> or something which tracks all of the futexes in use by the process, and synchronization is required on that map whenever using futex wait/wake. Also, the wake itself needs to get that thread back onto a run queue which can cause more lock contention, etc, based on the stacks i've seen while profiling this stuff. Line 89: DCHECK_EQ(queue_.size(), max_queue_size_); > This seems somewhat odd given the condition on L87. Defensive programming, yea, I think the point is that line 87 is the >=, but here we're just checking == because we don't expect it to ever get > Line 91: --it; > Should probably enforce that max_queue_size is > 0. That would be dumb, but k, added a CHECK in the ctor Line 110: cs->Post(nullptr); > Here we're not posting outside of lock_ the way we do in Put(). yea, doesn't much matter to optimize for the CPU performance of Shutdown() http://gerrit.cloudera.org:8080/#/c/2938/7/src/kudu/rpc/service_queue.h File src/kudu/rpc/service_queue.h: Line 51: // Each consumer thread will has its own lock and condition variable. If a > Nit: probably don't need the word "will" here. Done Line 58: // - the worker who was most recently busy is the one which we be selected for > Nit: which will be Done Line 76: bool BlockingGet(std::unique_ptr<InboundCall> *out); > Nit: std::unique_ptr<InbouundCall>* out Done Line 105: ANNOTATE_IGNORE_READS_BEGIN(); > The goal here being to return the size (or worker count, below) without acq yea. Binglin wrote this originally but I guess the reason is so that the benchmarks which poll this variable don't themselves cause any contention Line 207: std::vector<std::unique_ptr<ConsumerState> > consumers_; > Nit: C++11, so you can do >> to close the template. old habits -- To view, visit http://gerrit.cloudera.org:8080/2938 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibd1eb677dd52f89683eb648b42918fcf51437215 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Binglin Chang <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
