Abhishek Chennaka has posted comments on this change. ( http://gerrit.cloudera.org:8080/17903 )
Change subject: KUDU-1959 - Implement server startup progress page for tablet and master servers ...................................................................... Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/17903/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17903/2//COMMIT_MSG@7 PS2, Line 7: Implement server startup progress page for tablet and master servers > This doesn't describe this patch, does it? Doesn't this patch add metrics, Was under the impression all the patches should have the same header. Will change it accordingly http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/server/startup_path_handler.cc File src/kudu/server/startup_path_handler.cc: http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/server/startup_path_handler.cc@142 PS2, Line 142: int32 StartupPathHandler::StartupProgressPercentageMetric() { : int percent = 0; : percent += (init_progress_.IsStopped()) ? 100 : 0; : percent += (read_filesystem_progress_.IsStopped()) ? 100 : 0; : percent += (start_rpc_server_progress_.IsStopped()) ? 100 : 0; : if (is_tablet_server_) { : if (containers_total_ == 0 && read_data_directories_progress_.IsStopped()) { : percent += 100; : } else if (containers_total_ != 0) { : percent += containers_processed_ * 100 / containers_total_; : } : } : else { : percent += (initialize_master_catalog_progress_.IsStopped()) ? 100 : 0; : } : return percent/4; : } > I'm not convinced that this adds much value, over the other metrics added i True, but since we have the below metric, we would want to know if the startup is completed or not. Felt, this might be more informational rather than having a binary value. http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/server/startup_path_handler.cc@159 PS2, Line 159: : int64_t StartupPathHandler::StartupProgressTimeElapsedMetric() { : int64_t time_elapsed = 0; : time_elapsed += init_progress_.TimeElapsed().ToMilliseconds(); : time_elapsed += read_filesystem_progress_.TimeElapsed().ToMilliseconds(); : if (is_tablet_server_) { : time_elapsed += start_tablets_progress_.TimeElapsed().ToMilliseconds(); : } : else { : time_elapsed += initialize_master_catalog_progress_.TimeElapsed().ToMilliseconds(); : } : time_elapsed += start_rpc_server_progress_.TimeElapsed().ToMilliseconds(); : return time_elapsed; : } : > I'm similarly unconvinced that this is a useful metric. How do you expect t Right, this won't be super useful while the server is starting up, but will be useful to get a good estimate of how long the server might take to startup once restarted, given the amount of data in the tablet servers doesn't vary much. Third party monitoring tools can store a history of this metric to study/investigate the trend if needed. http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/tserver/ts_tablet_manager.cc File src/kudu/tserver/ts_tablet_manager.cc: http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/tserver/ts_tablet_manager.cc@1381 PS2, Line 1381: METRIC_tablets_opening_time_startup.Instantiate(server_->metric_entity(), : (start_tablets->TimeElapsed()).ToMilliseconds()); > It seems a bit odd to have a "time" metric that only gets updated based on So the metric will be instantiated only after all the tablets have been started or at least attempted to be started. If there are 100 tablets and 99 have been processed, this metric won't be present. After 100 tablets have been started or attempted to be started, we instantiate this metric with the appropriate value and that is never changed. This doesn't server too much purpose during the startup but to analyze the startup time taken, this would be useful. -- To view, visit http://gerrit.cloudera.org:8080/17903 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0a508c3baf0a0d77baf75f36f7bb305a6ad821e1 Gerrit-Change-Number: 17903 Gerrit-PatchSet: 2 Gerrit-Owner: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 11 Oct 2021 21:23:02 +0000 Gerrit-HasComments: Yes
