Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16169 )

Change subject: tablet: acquire/release locks in batches
......................................................................


Patch Set 2:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc
File src/kudu/tablet/lock_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@256
PS2, Line 256: Process
> nit: maybe rename this to RemoveFromBucket() or somesuch?
Done


http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@283
PS2, Line 283: prefetch
> I'm curious how you determined that this was a point of slowness. Or is the
This kind of batching+prefetching for hashtable lookup is a pretty common trick 
in the database hash join literature (eg Impala does it) so I knew from 
experience that it would probably be helpful. In terms of the profile, without 
the prefetching I can see the hot instruction is on the dependency chain that 
reads from the bucket, and profiling with 'perf record -e cache-misses' 
confirms that. So, I gave the prefetching a try and it improved this function's 
consumption by 2x or something.

In response to your question about "why 4" I fiddled with it and found a larger 
number improved things. Too large a number and it has diminishing returns (the 
CPU can only have a finite number of outstanding cache misses in flight at a 
time) and at some point it starts to harm things because the later prefetches 
might evict something pulled in by the earlier prefetch. Otherwise we could 
just skip the batching entirely and split the whole loop.


http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@383
PS2, Line 383:
             : void ScopedRowLock::ReleaseBatch(ArrayView<ScopedRowLock*> 
locks) {
             :   if (locks.empty()) return;
             :   for (auto* srl : locks) srl->Release();
             : }
> I might be missing something -- is this used anywhere?
Done


http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@414
PS2, Line 414: void LockManager::Lock(const Slice& key,
             :                        const OpState* op,
             :                        LockManager::LockMode mode,
             :                        LockEntry** entry) {
             :   *entry = locks_->GetLockEntry(key);
             :   AcquireLockOnEntry(*entry, op, mode);
             : }
             :
> Is this only around for tests?
removed


http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@422
PS2, Line 422: void LockManager::AcquireLockOnEntry(LockEntry* entry, const 
OpState* op, LockManager::LockMode mode) {
> warning: method 'AcquireLockOnEntry' can be made static [readability-conver
Done


http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@422
PS2, Line 422: void LockManager::AcquireLockOnEntry(LockEntry* entry, const 
OpState* op, LockManager::LockMode mode) {
> warning: parameter 'mode' is unused [misc-unused-parameters]
Done


http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@447
PS2, Line 447:     int waited_seconds = 0;
> warning: unused variable 'waited_seconds' [clang-diagnostic-unused-variable
Done


http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@449
PS2, Line 449:       const OpState* cur_holder = 
ANNOTATE_UNPROTECTED_READ(entry->holder_);
> warning: Value stored to 'cur_holder' during its initialization is never re
Done


http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@449
PS2, Line 449:       const OpState* cur_holder = 
ANNOTATE_UNPROTECTED_READ(entry->holder_);
> warning: unused variable 'cur_holder' [clang-diagnostic-unused-variable]
Done


http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@450
PS2, Line 450:       /*
             :       LOG(WARNING) << "Waited " << (++waited_seconds) << " 
seconds to obtain row lock on key "
             :                    << KUDU_REDACT(key.ToDebugString()) << " cur 
holder: " << cur_holder;
             :
             :                    TODO: reinstate the above
             :
             :       */
             :       // TODO(unknown): would be nice to also include some info 
about the blocking op,
             :       // but it's a bit tricky to do in a non-racy fashion (the 
other op may
             :       // complete at any point)
             :     }
             :     MicrosecondsInt64 wait_us = GetMonoTimeMicros() - 
start_wait_us;
             :     TRACE_COUNTER_INCREMENT("row_lock_wait_us", wait_us);
             :     if (wait_us > 100 * 1000) {
             :       //TRACE("Waited $0us for lock on $1", wait_us, 
KUDU_REDACT(key.ToDebugString()));
             :       // TODO reinstate
> Were these commented out for the sake of avoiding timing calls?
just commented out in the course of hacking stuff together because I had lost 
the local 'key' variable. Turns out it was easy to fix by just using entry->key


http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@474
PS2, Line 474:                           LockManager::LockMode mode,
> warning: parameter 'mode' is unused [misc-unused-parameters]
Done


http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/ops/write_op.h
File src/kudu/tablet/ops/write_op.h:

http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/ops/write_op.h@252
PS2, Line 252:   ScopedRowLock rows_lock; // TODO make private
> warning: missing username/bug in TODO [google-readability-todo]
Done



--
To view, visit http://gerrit.cloudera.org:8080/16169
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cb724e953ecdf188a35181c2f91b721b3416524
Gerrit-Change-Number: 16169
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <andrew.w...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Tue, 14 Jul 2020 16:44:18 +0000
Gerrit-HasComments: Yes

Reply via email to