Abhishek Chennaka has posted comments on this change. ( http://gerrit.cloudera.org:8080/17903 )
Change subject: KUDU-1959 - Implement server startup progress page for tablet and master servers ...................................................................... Patch Set 3: (7 comments) http://gerrit.cloudera.org:8080/#/c/17903/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17903/2//COMMIT_MSG@7 PS2, Line 7: Implement server startup progress page for tablet and master servers > While the headers may reuse jira numbers, each patch should have unique des Done http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/fs/fs_manager.cc File src/kudu/fs/fs_manager.cc: http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/fs/fs_manager.cc@112 PS2, Line 112: : METRIC_DEFINE_gauge_int64(server, log_block_manager_containers_processing_time_startup, : "Time taken to open all log block containers during server startup", : kudu::MetricUnit::kMilliseconds, : "The total time taken by the server to open all the container" : "files during the startup" > See my other thoughts on elapsed time metrics. Making this metric debug as discussed in https://gerrit.cloudera.org/#/c/17903/2/src/kudu/tserver/ts_tablet_manager.cc@1382 http://gerrit.cloudera.org:8080/#/c/17903/1/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/17903/1/src/kudu/fs/log_block_manager.cc@2561 PS1, Line 2561: ->set_value(containers_total->load()) > Rather than loading from the atomic (which isn't free), how about just incr Done http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/server/startup_path_handler.cc File src/kudu/server/startup_path_handler.cc: http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/server/startup_path_handler.cc@147 PS2, Line 147: > Seem's we're missing the tablets processed counter. Though see my below com Done http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/server/startup_path_handler.cc@142 PS2, Line 142: : int32 StartupPathHandler::StartupProgressPercentageMetric() { : int percent = 0; : percent += (init_progress_.IsStopped()) ? 100 : 0; : percent += (read_filesystem_progress_.IsStopped()) ? 100 : 0; : percent += (start_rpc_server_progress_.IsStopped()) ? 100 : 0; : if (is_tablet_server_) { : if (containers_total_ == 0 && read_data_directories_progress_.IsStopped()) { : percent += 100; : } else if (containers_total_ != 0) { : percent += containers_processed_ * 100 / containers_total_; : } : } else { : percent += (initialize_master_catalog_progress_.IsStopped()) ? 100 : 0; : } : return percent/4; : } > I see. Another approach would be to use METRIC_state() to define a counter So instead of a value between [0,100] the idea is to have it between [0-4], is that correct? In that case we can have it as a steps_remaining counter and count down, too. http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/server/startup_path_handler.cc@159 PS2, Line 159: : int64_t StartupPathHandler::StartupProgressTimeElapsedMetric() { : int64_t time_elapsed = 0; : time_elapsed += init_progress_.TimeElapsed().ToMilliseconds(); : time_elapsed += read_filesystem_progress_.TimeElapsed().ToMilliseconds(); : if (is_tablet_server_) { : time_elapsed += start_tablets_progress_.TimeElapsed().ToMilliseconds(); : } else { : time_elapsed += initialize_master_catalog_progress_.TimeElapsed().ToMilliseconds(); : } : time_elapsed += start_rpc_server_progress_.TimeElapsed().ToMilliseconds(); : return time_elapsed; : } : : > Makes sense. Please add that reasoning to the commit message, i.e. the "why Done http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/tserver/ts_tablet_manager.cc File src/kudu/tserver/ts_tablet_manager.cc: http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/tserver/ts_tablet_manager.cc@1381 PS2, Line 1381: ++*tablets_processed; : tablets_num_opened_startup_->Increment(); > I see your point. Though while it seems useful, maybe it's not _that_ usefu Sounds good. Making all the time metrics related to containers and tablets debug level from info level i.e.: tablets_opening_time_startup and log_block_manager_containers_processing_time_startup -- To view, visit http://gerrit.cloudera.org:8080/17903 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0a508c3baf0a0d77baf75f36f7bb305a6ad821e1 Gerrit-Change-Number: 17903 Gerrit-PatchSet: 3 Gerrit-Owner: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Tue, 19 Oct 2021 04:06:41 +0000 Gerrit-HasComments: Yes
