applike-ss opened a new issue #12292:
URL: https://github.com/apache/druid/issues/12292


   ### Motivation
   
   Our historicals currently restart every now and then (we run Druid on 
Kubernetes).
   This can have multiple causes, one of which is unexpected exceptions in 
the curator.
   In our scenario, the volumes are already filled with segments and 
Druid starts by initializing/reading them.
   This takes longer than 60 seconds, so the kubelet kills the pod before 
the "SERVER" stage is reached and the health route becomes available.
   
   ### Proposed changes
   
   I would like to propose a separate HTTP server that is already running 
in the "NORMAL" stage and serves routes such as:
   * /status/ready (which only reports ready once all segments are loaded)
   * /status/alive
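
To illustrate the two routes, here is a minimal sketch in Python rather than Druid's Java, using only the standard library. It is not Druid code: `ProbeState`, `mark_segments_loaded`, `probe_response` and `start_probe_server` are hypothetical names, and the status codes (200 when ready, 503 when not) are an assumption about how a kubelet HTTP probe would consume the routes.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ProbeState:
    """Thread-safe flag that a (hypothetical) segment loader sets when done."""

    def __init__(self):
        self._ready = threading.Event()

    def mark_segments_loaded(self):
        self._ready.set()

    def is_ready(self):
        return self._ready.is_set()


def probe_response(path, state):
    """Map a probe path to an (HTTP status, JSON body) pair."""
    if path == "/status/alive":
        # Alive as soon as the process is able to answer at all.
        return 200, {"alive": True}
    if path == "/status/ready":
        # Ready only once every segment has been loaded.
        ready = state.is_ready()
        return (200 if ready else 503), {"ready": ready}
    return 404, {"error": "not found"}


def make_handler(state):
    class ProbeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            status, body = probe_response(self.path, state)
            payload = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, format, *args):
            pass  # keep probe traffic out of the logs

    return ProbeHandler


def start_probe_server(state, host="127.0.0.1", port=0):
    """Serve probes on a background thread; port 0 picks a free port.

    Inside a pod the host would have to be "0.0.0.0" so the kubelet can reach it.
    """
    server = ThreadingHTTPServer((host, port), make_handler(state))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The point of the sketch is ordering: the probe server starts before segment loading begins, and the loader calls `mark_segments_loaded()` when it finishes, so liveness is answered during the long initialization while readiness stays negative.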
   
   Another option would be a separate HTTP server that serves the 
regular status route. In that case, however, I would suggest that the response 
include a JSON document with fields from which the Druid administrator can tell 
whether Druid is alive and ready. Again, I would treat missing segments (segments 
still to load) as a non-ready state. This might make a newly initialized 
historical look flaky, but I don't see that as an issue.
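
As a rough illustration of this second option, the status document could be assembled along these lines. This is a sketch in Python, not Druid code, and every field name (`alive`, `stage`, `segmentsToLoad`, `ready`) is an assumption made for the example rather than an existing Druid response format.

```python
import json


def build_status(stage, segments_to_load):
    """Build a status document; all field names are illustrative only."""
    if segments_to_load < 0:
        raise ValueError("segments_to_load must not be negative")
    return {
        # The process can answer, so it is alive.
        "alive": True,
        # Lifecycle stage reported as-is, e.g. "NORMAL" or "SERVER".
        "stage": stage,
        "segmentsToLoad": segments_to_load,
        # Any segment still to load counts as not ready.
        "ready": segments_to_load == 0,
    }


def render_status(stage, segments_to_load):
    """Serialize the status document as the JSON body of the status route."""
    return json.dumps(build_status(stage, segments_to_load))
```

With this shape, a historical that is still reading 12 segments in the "NORMAL" stage would report `"ready": false`, and the same field would flip back to false whenever new segments are assigned.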
   
   ### Rationale
   
   A discussion of why this particular solution is the best one. One good way 
to approach this is to discuss other alternative solutions that you considered 
and decided against. This should also include a discussion of any specific 
benefits or drawbacks you are aware of.
   
   ### Operational impact
   
   * there shouldn't be any backwards incompatibilities
   * the old status route can either be kept or moved to the new HTTP server 
(the latter would introduce an incompatibility)
   * cluster operators would need to adjust their Helm charts to use the new 
route and parse the JSON
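
For the last point, parsing the JSON on the operator side could look roughly like the sketch below, for example as a script run by an exec probe. It is written in Python with the standard library only. The field names `alive` and `ready` and the helper names `is_ready` and `probe_exit_code` are assumptions for illustration, since the response format is not yet defined.

```python
import json
import urllib.request


def is_ready(status_json):
    """Return True only if the document reports both alive and ready."""
    try:
        doc = json.loads(status_json)
    except (TypeError, ValueError):
        # Unparseable output is treated as not ready.
        return False
    if not isinstance(doc, dict):
        return False
    return doc.get("alive") is True and doc.get("ready") is True


def probe_exit_code(url, timeout=2.0):
    """Fetch the status route; 0 means ready, 1 means not ready or unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read()
    except OSError:
        # Covers connection errors, timeouts and HTTP error responses.
        return 1
    return 0 if is_ready(body) else 1
```

A probe script would end with something like `sys.exit(probe_exit_code(url))`, where `url` points at whichever route the new server ends up exposing.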


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


