Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/16169 )
Change subject: tablet: acquire/release locks in batches ...................................................................... Patch Set 2: (12 comments) http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc File src/kudu/tablet/lock_manager.cc: http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@256 PS2, Line 256: Process > nit: maybe rename this to RemoveFromBucket() or somesuch? Done http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@283 PS2, Line 283: prefetch > I'm curious how you determined that this was a point of slowness. Or is the This kind of batching+prefetching for hashtable lookup is a pretty common trick in the database hash join literature (eg Impala does it) so I knew from experience that it would probably be helpful. In terms of the profile, without the prefetching I can see the hot instruction is on the dependency chain that reads from the bucket, and profiling with 'perf record -e cache-misses' confirms that. So, I gave the prefetching a try and it improved this function's consumption by 2x or something. In response to your question about "why 4" I fiddled with it and found a larger number improved things. Too large a number and it has diminishing returns (the CPU can only have a finite number of outstanding cache misses in flight at a time) and at some point it starts to harm things because the later prefetches might evict something pulled in by the earlier prefetch. Otherwise we could just skip the batching entirely and split the whole loop. http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@383 PS2, Line 383: : void ScopedRowLock::ReleaseBatch(ArrayView<ScopedRowLock*> locks) { : if (locks.empty()) return; : for (auto* srl : locks) srl->Release(); : } > I might be missing something -- is this used anywhere? Done http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@414 PS2, Line 414: void LockManager::Lock(const Slice& key, : const OpState* op, : LockManager::LockMode mode, : LockEntry** entry) { : *entry = locks_->GetLockEntry(key); : AcquireLockOnEntry(*entry, op, mode); : } : > Is this only around for tests? removed http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@422 PS2, Line 422: void LockManager::AcquireLockOnEntry(LockEntry* entry, const OpState* op, LockManager::LockMode mode) { > warning: method 'AcquireLockOnEntry' can be made static [readability-conver Done http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@422 PS2, Line 422: void LockManager::AcquireLockOnEntry(LockEntry* entry, const OpState* op, LockManager::LockMode mode) { > warning: parameter 'mode' is unused [misc-unused-parameters] Done http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@447 PS2, Line 447: int waited_seconds = 0; > warning: unused variable 'waited_seconds' [clang-diagnostic-unused-variable Done http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@449 PS2, Line 449: const OpState* cur_holder = ANNOTATE_UNPROTECTED_READ(entry->holder_); > warning: Value stored to 'cur_holder' during its initialization is never re Done http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@449 PS2, Line 449: const OpState* cur_holder = ANNOTATE_UNPROTECTED_READ(entry->holder_); > warning: unused variable 'cur_holder' [clang-diagnostic-unused-variable] Done http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@450 PS2, Line 450: /* : LOG(WARNING) << "Waited " << (++waited_seconds) << " seconds to obtain row lock on key " : << KUDU_REDACT(key.ToDebugString()) << " cur holder: " << cur_holder; : : TODO: reinstate the above : : */ : // TODO(unknown): would be nice to also include some info about the blocking op, : // but it's a bit tricky to do in a non-racy fashion (the other op may : // complete at any point) : } : MicrosecondsInt64 wait_us = GetMonoTimeMicros() - start_wait_us; : TRACE_COUNTER_INCREMENT("row_lock_wait_us", wait_us); : if (wait_us > 100 * 1000) { : //TRACE("Waited $0us for lock on $1", wait_us, KUDU_REDACT(key.ToDebugString())); : // TODO reinstate > Were these commented out for the sake of avoiding timing calls? just commented out in the course of hacking stuff together because I had lost the local 'key' variable. Turns out it was easy to fix by just using entry->key http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/lock_manager.cc@474 PS2, Line 474: LockManager::LockMode mode, > warning: parameter 'mode' is unused [misc-unused-parameters] Done http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/ops/write_op.h File src/kudu/tablet/ops/write_op.h: http://gerrit.cloudera.org:8080/#/c/16169/2/src/kudu/tablet/ops/write_op.h@252 PS2, Line 252: ScopedRowLock rows_lock; // TODO make private > warning: missing username/bug in TODO [google-readability-todo] Done -- To view, visit http://gerrit.cloudera.org:8080/16169 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3cb724e953ecdf188a35181c2f91b721b3416524 Gerrit-Change-Number: 16169 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <andrew.w...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-Comment-Date: Tue, 14 Jul 2020 16:44:18 +0000 Gerrit-HasComments: Yes