KeDeng has posted comments on this change. ( http://gerrit.cloudera.org:8080/19392 )
Change subject: [www] add slow scans section ...................................................................... Patch Set 11: (25 comments) Thanks for your reviews. I've change the code as your suggestion. Please help to reconfirm at your convenient time. http://gerrit.cloudera.org:8080/#/c/19392/11//COMMIT_MSG Commit Message: PS11: > Could you add a test for the newly introduced ScannerManager::ListSlowScans Done http://gerrit.cloudera.org:8080/#/c/19392/11//COMMIT_MSG@7 PS11, Line 7: show > section Done http://gerrit.cloudera.org:8080/#/c/19392/11//COMMIT_MSG@9 PS11, Line 9: historical completed scans > the history of completed scans Done http://gerrit.cloudera.org:8080/#/c/19392/11//COMMIT_MSG@10 PS11, Line 10: convenient for us to get slow scans > easy to distinguish slow ones among those Done http://gerrit.cloudera.org:8080/#/c/19392/11//COMMIT_MSG@12 PS11, Line 12: change the previous scan presentation into completed scan show and : add slow scans show > introduced an extra section to the scan page to show slow scans separately Done http://gerrit.cloudera.org:8080/#/c/19392/11//COMMIT_MSG@13 PS11, Line 13: Slow scan defined only by the time used, which : threshold is --slow_scanner_threshold_ms > A scan is called 'slow' if it takes more time than defined by --slow_scanne Done http://gerrit.cloudera.org:8080/#/c/19392/11//COMMIT_MSG@15 PS11, Line 15: The number of slow scans show is limit by --slow_scan_history_count > The number of elements in the slow scans history is limited by --slow_scan_ Done http://gerrit.cloudera.org:8080/#/c/19392/11//COMMIT_MSG@17 PS11, Line 17: Change-Id: Ibd1dcac8b81c5eefd306e7020c9c52b3f28e603c > BTW, did you consider an alternative approach: instead of two separate list Yes, this is another implementation. I have also considered it. After all, this method is less changed. But when I tested this way, I found that it might not achieve my original purpose. The goal of adding this feature is to filter out slow scans, not the slow ones from the recently completed scans. After all, if the most recent records are completed quickly, we will not be able to obtain any information useful for slow scans. http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.h File src/kudu/tserver/scanners.h: http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.h@115 PS11, Line 115: scoped_refptr<ScanDescriptor> > Consider introducing a typedef for scoped_refptr<ScanDescriptor> (say, Shar Done http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.h@117 PS11, Line 117: recently > recent Done http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.h@118 PS11, Line 118: Slow scan defined only by the time used, which threshold is : // --slow_scanner_threshold_ms. > A scan is 'slow' if it takes more than --slow_scanner_threshold_ms to compl Done http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.h@120 PS11, Line 120: The member number of return vector is limit by --slow_scan_history_count > The number of elements in the result vector is limited by --slow_scan_histo Done http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.h@126 PS11, Line 126: whose scan exceeds time threshold > whose scan times exceed the threshold Done http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.h@478 PS11, Line 478: ScanDescriptor is copyable > If I'm not mistaken, ScanDescriptor is copyable even without wrapping it in Yes, you are right. I will update the code annotation. http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.h@479 PS11, Line 479: : public RefCountedThreadSafe<ScanDescriptor> { > Just curious: why RefCountedThreadSafe seems to be more preferable over st The 'RefCounted' is my choice in the alpha version. In this version, there is no problem with the slow scans display function, but some unit tests involve the concurrent use of 'ScanDescriptor'. Finally, I use 'RefCountedThreadSafe' to replace to solve these problems. http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.cc File src/kudu/tserver/scanners.cc: http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.cc@68 PS11, Line 68: 60000 > nit: for better readability, consider replacing with 60 * 1000L Done http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.cc@68 PS11, Line 68: minutes > minute Done http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.cc@69 PS11, Line 69: Number of milliseconds for the threshold of slow scan." > What do you think of introducing a special value for the flag (e.g., -1 or Sounds like good. I will implement this function in the next submission. http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.cc@75 PS11, Line 75: The slow " : "scans define with --slow_scanner_threshold_ms > The threshold for a slow scan is defined with --slow_scanner_threshold_ms Done http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.cc@153 PS11, Line 153: { : std::lock_guard<percpu_rwlock> l(completed_scans_lock_); : completed_scans_.clear(); : } : { : std::lock_guard<percpu_rwlock> l(slow_scans_lock_); : slow_scans_.clear(); : } > Why is it needed? The containers would be automatically cleaned up on dest Done http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.cc@291 PS11, Line 291: desc > Why not to continue using the move semantics here? Done http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.cc@321 PS11, Line 321: unordered_map<string, scoped_refptr<ScanDescriptor>> scans; : { : kudu::shared_lock<rw_spinlock> l(slow_scans_lock_.get_lock()); : for (const auto& scan : slow_scans_) { : InsertOrUpdate(&scans, scan->scanner_id, scan); : } : } > What's the purpose of using unordered dictionary/map here if later on the r This is mainly to prevent the occurrence of duplicate records. http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.cc@355 PS11, Line 355: FLAGS_slow_scanner_threshold_ms > There are 3 accesses to this flag in the scope of this method. Every acces Done http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.cc@361 PS11, Line 361: LOG(INFO) << Substitute( : "Slow scanner id: $0, of tablet $1, " : "exceed the time threshold $2 ms for $3 ms.", : it->first, : scanner->tablet_id(), : FLAGS_slow_scanner_threshold_ms, : delta_time.ToMilliseconds()); > Do you really want this to be logged for every scanner which goes beyond th What I hope is that there are ways to obtain these slow scans logs if necessary. I can turn it into Vlog. What do you think? http://gerrit.cloudera.org:8080/#/c/19392/11/src/kudu/tserver/scanners.cc@436 PS11, Line 436: if (slow_scans_.capacity() == 0) { : return; : } : if (slow_scans_.size() == slow_scans_.capacity()) { : slow_scans_[slow_scans_offset_++] = descriptor; : if (slow_scans_offset_ == slow_scans_.capacity()) { : slow_scans_offset_ = 0; : } : } else { : slow_scans_.emplace_back(descriptor); : } > Instead of duplicating this code from RecordCompletedScanUnlocked(), consid Done -- To view, visit http://gerrit.cloudera.org:8080/19392 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibd1dcac8b81c5eefd306e7020c9c52b3f28e603c Gerrit-Change-Number: 19392 Gerrit-PatchSet: 11 Gerrit-Owner: KeDeng <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: KeDeng <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 03 Feb 2023 08:59:55 +0000 Gerrit-HasComments: Yes
