Jean-Daniel Cryans created KUDU-1959:
----------------------------------------

             Summary: Hard to tell when a cluster is done starting up
                 Key: KUDU-1959
                 URL: https://issues.apache.org/jira/browse/KUDU-1959
             Project: Kudu
          Issue Type: Improvement
            Reporter: Jean-Daniel Cryans


When restarting a cluster that has a good amount of data, it's hard to tell when 
it's "done". Right now, these are the things I do:
 - Run ksck, wait until most tablets are not in "unavailable" or "bootstrapping" 
state.
 - Watch the metrics and see when the data under management is close to where 
it was before restarting (it grows as tablets are getting bootstrapped).
 - Look at the tablet server web UIs for tablets, compare how many are done 
bootstrapping vs. still bootstrapping vs. not started.
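
To make the manual checks above concrete, here is a rough sketch of how they could be combined into one "are we done yet" answer. Everything in it is an assumption for illustration: the function name, the state strings, and the 95% threshold are made up, and the inputs are expected to be gathered by hand (from ksck output, the web UIs, and the metrics), not through any existing Kudu API.

```python
from collections import Counter


def startup_progress(tablet_states, data_bytes_now, data_bytes_before,
                     threshold=0.95):
    """Summarize how far a cluster restart has progressed.

    Hypothetical inputs:
      tablet_states     -- list of state strings, one per tablet replica
      data_bytes_now    -- data under management right now
      data_bytes_before -- data under management before the restart
      threshold         -- fraction treated as "close enough" (arbitrary)
    """
    counts = Counter(tablet_states)
    total = len(tablet_states)

    # Fraction of replicas that have finished bootstrapping.
    running_fraction = counts["RUNNING"] / total if total else 0.0

    # How close the data under management is to its pre-restart size.
    data_fraction = (data_bytes_now / data_bytes_before
                     if data_bytes_before else 0.0)

    return {
        "counts": dict(counts),
        "running_fraction": running_fraction,
        "data_fraction": data_fraction,
        "done": running_fraction >= threshold and data_fraction >= threshold,
    }


if __name__ == "__main__":
    states = ["RUNNING"] * 8 + ["BOOTSTRAPPING", "NOT_STARTED"]
    print(startup_progress(states, data_bytes_now=80, data_bytes_before=100))
```

Requiring both signals to pass mirrors the fact that neither check is trusted on its own today.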

Ideas on how to improve this:
 - In the master's web UI for tablet servers, show how many tablets are running 
VS not running (I wouldn't add anything about tombstoned tablets)
 - Add metrics for tablets in different states.
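
As a sketch of what the per-tablet-server view could compute, the snippet below counts running vs. not running replicas for each server and skips tombstoned ones. The input shape, the state strings, and the function name are hypothetical and only meant to illustrate the idea, not how the master actually tracks this.

```python
from collections import defaultdict


def running_summary(replicas):
    """Count running vs. not running replicas per tablet server.

    replicas is a hypothetical iterable of (tablet_server, state) pairs.
    Tombstoned replicas are left out of both counts.
    """
    summary = defaultdict(lambda: {"running": 0, "not_running": 0})
    for server, state in replicas:
        if state == "TOMBSTONED":
            continue  # deliberately not reported
        key = "running" if state == "RUNNING" else "not_running"
        summary[server][key] += 1
    return dict(summary)


if __name__ == "__main__":
    sample = [
        ("ts-1", "RUNNING"),
        ("ts-1", "BOOTSTRAPPING"),
        ("ts-1", "TOMBSTONED"),
        ("ts-2", "RUNNING"),
    ]
    for server, counts in sorted(running_summary(sample).items()):
        print(server, counts)
```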



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
