jadami10 commented on issue #9023:
URL: https://github.com/apache/pinot/issues/9023#issuecomment-1184790620

   We have the same problem: servers take too long to become healthy, so the 
ASG recycles them forever. This happens on intentional restarts as well as 
unintentional ones. Rewording what you said to see if we agree: the issue 
seems to be that the `/health` endpoint is doing too much:
   - The infrastructure responsible for making sure the instance or container 
is healthy really just wants to know, "are you working correctly?" As long as 
the server is up and doing something, that check should return 200.
   - The infrastructure used to coordinate rolling restarts wants to know, "are 
you up, and have you loaded all your segments?"
   
   I'm +1 on your counterproposal. We can use `/health/instance` for whatever 
instance/container manager is being used and leave `/health` as is for 
backwards compatibility.
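
   To make the split concrete, here is a rough sketch of the two checks using 
the JDK's built-in `HttpServer`. This is not Pinot's actual code: the class 
name, the `segmentsLoaded` flag, and the 503 status for "not ready yet" are 
all assumptions made for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.concurrent.atomic.AtomicBoolean;

public class HealthEndpointsSketch {
    // Hypothetical flag: the server would flip this once all segments are loaded.
    static final AtomicBoolean segmentsLoaded = new AtomicBoolean(false);

    // "Are you working correctly?": if the process can answer at all, say 200.
    static int instanceStatus() {
        return 200;
    }

    // "Are you up and have you loaded all your segments?": 200 only when loaded.
    // 503 for the not-ready case is an assumption, not a documented Pinot status.
    static int healthStatus() {
        return segmentsLoaded.get() ? 200 : 503;
    }

    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        // For the instance/container manager (for example an ASG health check).
        server.createContext("/health/instance", exchange -> {
            exchange.sendResponseHeaders(instanceStatus(), -1); // -1: no body
            exchange.close();
        });
        // Existing semantics, kept for backwards compatibility.
        server.createContext("/health", exchange -> {
            exchange.sendResponseHeaders(healthStatus(), -1);
            exchange.close();
        });
        server.start();
        return server;
    }
}
```

   With something like this, the ASG would poll `/health/instance` and stop 
recycling servers that are still loading segments, while rolling-restart 
tooling would keep polling `/health`.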


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

