rbankar7 opened a new issue, #18090: URL: https://github.com/apache/druid/issues/18090
## Affected versions

- All users of the Kubernetes extension are probably affected.

## Description

- In the Druid Kubernetes extension, data nodes have an announcement lifecycle stage whose stop step runs before termination.
- That step unannounces the pod while it is in the Terminating state.
- As a result, the other Druid master nodes and the brokers become aware of the termination and stop assigning tasks/segments or routing queries to that pod.
- In the node not ready case, processing on the pod stops abruptly, so no "unannouncement" is made to the other nodes.
- Master nodes and brokers/routers continue to detect the node. This results in high query latency and a monotonically increasing loadQueue count or, for indexers, monotonically increasing ingestion lag.
- This continues until the issue is fixed for the indexers, or until the retention period has passed for the historical.

## Reproducing the issue

- On a test cluster, we deployed Druid with the final unannouncement of the node deliberately disabled.
- We also added a sleep after `stop()` and increased the `gracefulTerminationPeriod`, which kept the pod in the Terminating state for a long time.
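
To make the difference between the two shutdown paths concrete, here is a minimal toy model of the behaviour described in this issue. It is only a sketch under stated assumptions: `DiscoveryRegistry`, `DataNode`, `crash()` and `stale_nodes()` are hypothetical names invented for illustration, not Druid or Kubernetes APIs, and the registry merely stands in for whatever the master nodes and brokers consult to discover data nodes.

```python
# Toy model only: every class and function name here is hypothetical,
# not part of Druid or its Kubernetes extension.

class DiscoveryRegistry:
    """Stands in for the view that master nodes and brokers have of data nodes."""

    def __init__(self):
        self.announced = set()

    def announce(self, node_name):
        self.announced.add(node_name)

    def unannounce(self, node_name):
        self.announced.discard(node_name)


class DataNode:
    def __init__(self, name, registry):
        self.name = name
        self.registry = registry
        self.alive = False

    def start(self):
        self.alive = True
        self.registry.announce(self.name)

    def stop(self):
        # Graceful termination: the lifecycle stop step runs and unannounces.
        self.registry.unannounce(self.name)
        self.alive = False

    def crash(self):
        # Node not ready: processing stops abruptly, the stop step never runs.
        self.alive = False


def stale_nodes(registry, nodes):
    """Names of nodes that are still announced but no longer processing."""
    return sorted(n.name for n in nodes
                  if n.name in registry.announced and not n.alive)


registry = DiscoveryRegistry()
graceful = DataNode("historical-0", registry)
abrupt = DataNode("historical-1", registry)
for node in (graceful, abrupt):
    node.start()

graceful.stop()   # unannounced, so nothing is routed to it any more
abrupt.crash()    # still announced, so it keeps receiving work

print(stale_nodes(registry, [graceful, abrupt]))  # prints ['historical-1']
```

In this model the abruptly stopped node stays in the registry indefinitely, which corresponds to the growing loadQueue count and ingestion lag reported above.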
