Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17730 )
Change subject: KUDU-1959 - Implement server startup progress page for tablet and master servers ...................................................................... Patch Set 16: (20 comments) http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/server/startup_path_handler.h File src/kudu/server/startup_path_handler.h: http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/server/startup_path_handler.h@79 PS16, Line 79: int Shouldn't this be atomic too? http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/server/startup_path_handler.cc File src/kudu/server/startup_path_handler.cc: http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/server/startup_path_handler.cc@37 PS16, Line 37: StartupProgress& nit: since this is meant to be only call const methods, it should be marked const. http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/server/startup_path_handler.cc@41 PS16, Line 41: else if (percent == 100) { : output->Set(step + "_time", : HumanReadableElapsedTime::ToShortString( : (*startup_step.end_time() - *startup_step.start_time()).ToSeconds())); : } else { : output->Set(step + "_time", : HumanReadableElapsedTime::ToShortString( : (MonoTime::Now() - *startup_step.start_time()).ToSeconds())); : } If switching to the Timer::{Start,Stop} API, consider also adding an ElapsedTime call that uses MonoTime::Now() if the stop time isn't initialized. Then, there'd be no need for the "else if" http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/server/startup_path_handler.cc@72 PS16, Line 72: (*init_progress_.end_time() == MonoTime::Min()) nit: if going the route of making this a more generic Timer class, consider adding a IsStopped() method that does this comparison http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/server/startup_path_handler.cc@100 PS16, Line 100: if (tablets_total_ == 0 && *start_tablets_progress_.end_time() == MonoTime::Min()) { : output->Set("start_tablets_status", 0); : output->Set("start_tablets_time", HumanReadableElapsedTime::ToShortString(0)); : } else if (tablets_total_ == 0 && *start_tablets_progress_.end_time() != MonoTime::Min()) { : output->Set("start_tablets_status", 100); : output->Set("start_tablets_time", HumanReadableElapsedTime::ToShortString(0)); : } else { : percent = tablets_processed_ * 100 / tablets_total_; : SetWebResponse(output, "start_tablets", start_tablets_progress_, percent); : } nit: don't we have a method for this? Consider simplifying this logic to something like SetWebResponse(output, "start_tablets", timer_, tablets_total_ == 0 ? (timer_->IsStopped() ? 100 : 0) : tablets_processed * 100 / tablets_total_); Similar adjustments can maybe be made above with "read_data_directories" http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/server/webserver.cc File src/kudu/server/webserver.cc: http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/server/webserver.cc@195 PS16, Line 195: & nit: move the ampersand to the left http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/server/webserver.cc@734 PS16, Line 734: // Do not use the template for home page if the server's initialization is not complete. nit: the what is clear from this code; it'd be great if you could explain the _why_, since at first glance this is a bit surprising. http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/tserver/ts_tablet_manager.h File src/kudu/tserver/ts_tablet_manager.h: http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/tserver/ts_tablet_manager.h@330 PS16, Line 330: // nit: remove this extra comment line http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/tserver/ts_tablet_manager.cc File src/kudu/tserver/ts_tablet_manager.cc: http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/tserver/ts_tablet_manager.cc@390 PS16, Line 390: int This also needs to be atomic, in order to ensure reads and writes happen atomically (no intermediate garbage value is read when concurrently written) http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/tserver/ts_tablet_manager.cc@594 PS16, Line 594: nit: could you move this four spaces to the left? http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/tserver/ts_tablet_manager.cc@1243 PS16, Line 1243: const The value is actually being mutated, so it shouldn't be const. http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/tserver/ts_tablet_manager.cc@1349 PS16, Line 1349: nit: fix spacing http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/tserver/ts_tablet_manager.cc@1353 PS16, Line 1353: if (*tablets_processed == *tablets_total) { : start_tablets->SetEndTime(MonoTime::Now()); : } Why do this here instead of in tablet_server.cc? In general the start and end calls seem to always happen in pairs, so it seems odd to split them up like this. Any particular reason for doing so? http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/util/startup_progress.h File src/kudu/util/startup_progress.h: http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/util/startup_progress.h@19 PS16, Line 19: #include <atomic> nit: add a newline after <atomic> to separate the system headers from our headers http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/util/startup_progress.h@20 PS16, Line 20: #include "kudu/server/webserver.h" nit: we can remove this; it doesn't appear to be used http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/util/startup_progress.h@25 PS16, Line 25: namespace server { nit: we typically only use the kudu namespace for things in the util directory. http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/util/startup_progress.h@28 PS16, Line 28: StartupProgress Now that we've boiled this down to its basics, it looks very much like a pretty simple timer. How about leaning into that and calling this Timer, and making its API mutable Start() and Stop() that just set the start and end times to MonoTime::Now()? That way all the call-sites don't have to call MonoTime::Now(). In general, it's a good practice to make the API as simple and generic as we can, and in this case there doesn't seem to be anything specific to startup in this class. http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/util/startup_progress.h@32 PS16, Line 32: * nit: this should return a MonoTime, not a const Monotime*. The difference is that returning a MonoTime returns a copy of start_time_, which _should_ be copied while lock_ is held. Once you do that, the method should be marked const too, since the method doesn't mutate state. I.e. class StartupProgress { public: MonoTime start_time() const { std::lock_guard<simple_spinlock> l(lock_); return start_time_; } ... mutable simple_spinlock lock_; }; Same below. http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/util/startup_progress.cc File src/kudu/util/startup_progress.cc: http://gerrit.cloudera.org:8080/#/c/17730/16/src/kudu/util/startup_progress.cc@26 PS16, Line 26: StartupProgress::StartupProgress() : : start_time_(MonoTime::Min()), : end_time_(MonoTime::Min()) { : } : : void StartupProgress::SetEndTime(const MonoTime &end_time) { : std::lock_guard<simple_spinlock> l(lock_); : StartupProgress::end_time_ = end_time; : } : : void StartupProgress::SetStartTime(const MonoTime &start_time) { : std::lock_guard<simple_spinlock> l(lock_); : StartupProgress::start_time_ = start_time; : } nit: IMO this is simple enough that we might as well put this all in the header http://gerrit.cloudera.org:8080/#/c/17730/16/www/startup.mustache File www/startup.mustache: PS16: nit: while we don't have a strict style guide for HTML, it's probably worth keeping the spacing styling consistent with what we have elsewhere e.g. https://github.com/apache/kudu/blob/master/www/masters.mustache -- To view, visit http://gerrit.cloudera.org:8080/17730 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1db1fcf16261d4ced1b3657a697766a5335271b4 Gerrit-Change-Number: 17730 Gerrit-PatchSet: 16 Gerrit-Owner: Abhishek Chennaka <achenn...@cloudera.com> Gerrit-Reviewer: Abhishek Chennaka <achenn...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Attila Bukor <abu...@apache.org> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Tue, 05 Oct 2021 08:18:24 +0000 Gerrit-HasComments: Yes