mluvin-stripe opened a new issue, #17599: URL: https://github.com/apache/pinot/issues/17599
## Problem

When using the “pause ingestion based on resource utilization” feature ([docs](https://docs.pinot.apache.org/operators/operating-pinot/pause-ingestion-based-on-resource-utilization)), a controller that has just restarted does not have its cache of server disk utilization information populated until the ResourceUtilizationChecker periodic task runs.

There’s a config, `controller.resource.utilization.checker.initial.delay`, that we can set to zero seconds to start populating the cache immediately. However, the controller could still start serving requests before the checker finishes populating the cache, since the controller doesn’t wait for the checker to finish before marking itself as ready.

This is a problem for minion-based offline segment generation ([code](https://github.com/apache/pinot/blob/b4081d6003347020cc5e38eb3c60638c6aa8f1de/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotTaskManager.java#L332-L337)) and offline segment uploads (new feature proposed in https://github.com/apache/pinot/issues/17557). The disk utilization check returns UNDETERMINED if the controller’s disk utilization cache isn’t yet populated, so the segment creation/upload is allowed to proceed even if the disk threshold has already been breached.

## Solution

I propose adding an opt-in config, `controller.resource.utilization.checker.waitDuringStartup`, that ensures the disk utilization cache is populated before the controller is marked as ready. This way, the controller can correctly reject segment creation/upload requests as soon as it starts serving them.

I was thinking of adding another serviceStatusCallback (like [this one](https://github.com/apache/pinot/blob/b4081d6003347020cc5e38eb3c60638c6aa8f1de/pinot-controller/src/main/java/org/apache/pinot/controller/BaseControllerStarter.java#L758)) that checks whether the disk utilization cache has been populated yet, and doesn’t return GOOD until it is.
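To make the proposed callback concrete, here is a rough, self-contained sketch of the gating logic. It is not Pinot code: `StartupGateSketch`, `DiskUtilizationCache`, `CacheReadyCallback`, the `Status` enum and the `refresh` method are all hypothetical stand-ins, and the real change would presumably plug into the existing service status callback mechanism instead. The only names taken from this issue are the GOOD status and the `waitDuringStartup` flag.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: every type here is a hypothetical stand-in,
// not one of Pinot's actual classes.
class StartupGateSketch {
  // Assumed status values; only GOOD is named in the issue.
  enum Status { STARTING, GOOD }

  /** Stand-in for the controller's per-server disk utilization cache. */
  static final class DiskUtilizationCache {
    private final Map<String, Double> usedFractionByServer = new ConcurrentHashMap<>();
    private volatile boolean populated = false;

    /** Called by the (hypothetical) checker after it completes a full refresh. */
    void refresh(Map<String, Double> latest) {
      usedFractionByServer.putAll(latest);
      populated = true; // set last, so readers that see the flag also see the data
    }

    boolean isPopulated() {
      return populated;
    }
  }

  /** Reports STARTING until the cache is populated, but only when the opt-in flag is set. */
  static final class CacheReadyCallback {
    private final DiskUtilizationCache cache;
    private final boolean waitDuringStartup;

    CacheReadyCallback(DiskUtilizationCache cache, boolean waitDuringStartup) {
      this.cache = cache;
      this.waitDuringStartup = waitDuringStartup;
    }

    Status getServiceStatus() {
      if (!waitDuringStartup || cache.isPopulated()) {
        return Status.GOOD;
      }
      return Status.STARTING;
    }
  }

  public static void main(String[] args) {
    DiskUtilizationCache cache = new DiskUtilizationCache();
    CacheReadyCallback gated = new CacheReadyCallback(cache, true);
    CacheReadyCallback ungated = new CacheReadyCallback(cache, false);

    System.out.println("gated before refresh:   " + gated.getServiceStatus());
    System.out.println("ungated before refresh: " + ungated.getServiceStatus());

    cache.refresh(Map.of("Server_1", 0.97)); // hypothetical server name and value
    System.out.println("gated after refresh:    " + gated.getServiceStatus());
  }
}
```

With the flag off, the callback is always GOOD, which keeps today's behavior as the default. One open design question the sketch glosses over is what should happen if the first refresh fails or never completes; a real implementation would probably need a timeout or failure path so the controller cannot stay in the starting state indefinitely.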
