Abhishek Chennaka has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17903 )

Change subject: KUDU-1959 - Implement server startup progress page for tablet 
and master servers
......................................................................


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17903/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17903/2//COMMIT_MSG@7
PS2, Line 7: Implement server startup progress page for tablet and master 
servers
> This doesn't describe this patch, does it? Doesn't this patch add metrics,
Was under the impression all the patches should have the same header. Will 
change it accordingly


http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/server/startup_path_handler.cc
File src/kudu/server/startup_path_handler.cc:

http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/server/startup_path_handler.cc@142
PS2, Line 142: int32 StartupPathHandler::StartupProgressPercentageMetric() {
             :   int percent = 0;
             :   percent += (init_progress_.IsStopped()) ? 100 : 0;
             :   percent += (read_filesystem_progress_.IsStopped()) ? 100 : 0;
             :   percent += (start_rpc_server_progress_.IsStopped()) ? 100 : 0;
             :   if (is_tablet_server_) {
             :     if (containers_total_ == 0 && 
read_data_directories_progress_.IsStopped()) {
             :       percent += 100;
             :     } else if (containers_total_ != 0) {
             :       percent += containers_processed_ * 100 / containers_total_;
             :     }
             :   }
             :   else {
             :     percent += (initialize_master_catalog_progress_.IsStopped()) 
? 100 : 0;
             :   }
             :   return percent/4;
             : }
> I'm not convinced that this adds much value, over the other metrics added i
True, but since we have the below metric, we would want to know if the startup 
is completed or not. Felt, this might be more informational rather than having 
a binary value.


http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/server/startup_path_handler.cc@159
PS2, Line 159:
             : int64_t StartupPathHandler::StartupProgressTimeElapsedMetric() {
             :   int64_t time_elapsed = 0;
             :   time_elapsed += init_progress_.TimeElapsed().ToMilliseconds();
             :   time_elapsed += 
read_filesystem_progress_.TimeElapsed().ToMilliseconds();
             :   if (is_tablet_server_) {
             :     time_elapsed += 
start_tablets_progress_.TimeElapsed().ToMilliseconds();
             :   }
             :   else {
             :     time_elapsed += 
initialize_master_catalog_progress_.TimeElapsed().ToMilliseconds();
             :   }
             :   time_elapsed += 
start_rpc_server_progress_.TimeElapsed().ToMilliseconds();
             :   return time_elapsed;
             : }
             :
> I'm similarly unconvinced that this is a useful metric. How do you expect t
Right, this won't be super useful while the server is starting up, but will be 
useful to get a good estimate of how long the server might take to startup once 
restarted, given the amount of data in the tablet servers doesn't vary much. 
Third party monitoring tools can store a history of this metric to 
study/investigate the trend if needed.


http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17903/2/src/kudu/tserver/ts_tablet_manager.cc@1381
PS2, Line 1381:       
METRIC_tablets_opening_time_startup.Instantiate(server_->metric_entity(),
              :           (start_tablets->TimeElapsed()).ToMilliseconds());
> It seems a bit odd to have a "time" metric that only gets updated based on
So the metric will be instantiated only after all the tablets have been started 
or at least attempted to be started. If there are 100 tablets and 99 have been 
processed, this metric won't be present. After 100 tablets have been started or 
attempted to be started, we instantiate this metric with the appropriate value 
and that is never changed.
This doesn't server too much purpose during the startup but to analyze the 
startup time taken, this would be useful.



--
To view, visit http://gerrit.cloudera.org:8080/17903
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a508c3baf0a0d77baf75f36f7bb305a6ad821e1
Gerrit-Change-Number: 17903
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 11 Oct 2021 21:23:02 +0000
Gerrit-HasComments: Yes

Reply via email to