Todd Lipcon has posted comments on this change.

Change subject: rpc: LIFO service queue
......................................................................


Patch Set 7:

(22 comments)

will push these changes as a new patch

http://gerrit.cloudera.org:8080/#/c/2938/7/build-support/tsan-suppressions.txt
File build-support/tsan-suppressions.txt:

Line 26: # local_addr_space_init(), suppressing init() suppresses both races.
> Apparently this comment is no longer correct. Can you update it?
Done


http://gerrit.cloudera.org:8080/#/c/2938/7/src/kudu/rpc/service_queue-test.cc
File src/kudu/rpc/service_queue-test.cc:

Line 34: using std::vector;
> Nit: unsorted.
Done


Line 62:     InboundCall * call = new InboundCall(nullptr);
> Nit: InboundCall* call
Done


Line 66:       LOG(INFO) << "queue, exit";
> Nit: queue full, exit
Done


Line 90:     call.reset(nullptr);
> Didn't know you needed to provide an argument.
Done


Line 96:   vector<shared_ptr<boost::thread>> producers;
> Since this is a new test, could we use std::thread instead?
yea, and I dont think the shared_ptr is necessary since std::thread is moveable.


Line 99:   for (int i = 0; i< FLAGS_num_producers; i++) {
> Nit: i < FLAGS_
Done


Line 120:     total_queue_len = queue.estimated_queue_length();
        :     total_idle_workers = queue.estimated_idle_worker_count();
> What's the point of calling this inside the loop? Each call overwrites the 
oops, those were supposed to be +=s


http://gerrit.cloudera.org:8080/#/c/2938/7/src/kudu/rpc/service_queue.cc
File src/kudu/rpc/service_queue.cc:

Line 36: bool LifoServiceQueue::BlockingGet(std::unique_ptr<InboundCall> *out) {
> Nit: type* out
Done


Line 39:     lock_guard<simple_spinlock> l(&lock_);
> Nit: technically we need only lock for the consumers_ mutation.
Done


Line 41:     consumers_.emplace_back(consumer);
> AFAICT, consumers_ just accumulates entries until the entire queue is destr
yea, the idea is to avoid the memory leak, though I suppose we could instead 
have the threads delete their own ConsumerState on shutdown. You think that's 
preferable? Mind if I just add a TODO and address it later since there are a 
few other patches in flight for this code?


Line 53:       if (shutdown_) {
> Nit: maybe PREDICT_FALSE here?
Done


Line 59:     InboundCall* call = consumer->Wait();
> Maybe add a comment saying that if 'call' was null, we're shutting down the
Done


Line 80:     // Notify condition var(and wake up consumer thread) takes time,
> Really? Somewhat surprised by that.
yea, the futex_wake has to take some kernel spinlocks and semaphores which 
implies at least some potential for cacheline bounces if not contention. (the 
kernel internally has a map<void*, futex_wait_queue> or something which tracks 
all of the futexes in use by the process, and synchronization is required on 
that map whenever using futex wait/wake. Also, the wake itself needs to get 
that thread back onto a run queue which can cause more lock contention, etc, 
based on the stacks i've seen while profiling this stuff.


Line 89:     DCHECK_EQ(queue_.size(), max_queue_size_);
> This seems somewhat odd given the condition on L87. Defensive programming, 
yea, I think the point is that line 87 is the >=, but here we're just checking 
== because we don't expect it to ever get >


Line 91:     --it;
> Should probably enforce that max_queue_size is > 0. That would be dumb, but
k, added a CHECK in the ctor


Line 110:     cs->Post(nullptr);
> Here we're not posting outside of lock_ the way we do in Put().
yea, doesn't much matter to optimize for the CPU performance of Shutdown()


http://gerrit.cloudera.org:8080/#/c/2938/7/src/kudu/rpc/service_queue.h
File src/kudu/rpc/service_queue.h:

Line 51: // Each consumer thread will has its own lock and condition variable. 
If a
> Nit: probably don't need the word "will" here.
Done


Line 58: // - the worker who was most recently busy is the one which we be 
selected for
> Nit: which will be
Done


Line 76:   bool BlockingGet(std::unique_ptr<InboundCall> *out);
> Nit: std::unique_ptr<InbouundCall>* out
Done


Line 105:     ANNOTATE_IGNORE_READS_BEGIN();
> The goal here being to return the size (or worker count, below) without acq
yea. Binglin wrote this originally but I guess the reason is so that the 
benchmarks which poll this variable don't themselves cause any contention


Line 207:   std::vector<std::unique_ptr<ConsumerState> > consumers_;
> Nit: C++11, so you can do >> to close the template.
old habits


-- 
To view, visit http://gerrit.cloudera.org:8080/2938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1eb677dd52f89683eb648b42918fcf51437215
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Binglin Chang <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes

Reply via email to