cryptoe opened a new pull request, #14172: URL: https://github.com/apache/druid/pull/14172
The original intent of the `TaskStartTimeoutFault` was to throw an error if the MSQ job is stuck waiting for tasks. The current approach of a fixed 10-minute timeout does not make sense if you are autoscaling. Adjusted the logic so that the timeout is measured from the time the last worker was successfully launched instead of from a fixed start time. This ensures that, as long as tasks keep launching, the job is treated as making progress and the fault is not raised.
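The idea described above can be sketched roughly as follows. This is only an illustration and not the code from the pull request: the class `LaunchProgressTimeout` and its methods are hypothetical names, and timestamps are passed in explicitly as milliseconds so the behavior is easy to follow.

```java
// Hypothetical sketch: a timeout that resets whenever a worker launches.
// None of these names come from the Druid codebase.
final class LaunchProgressTimeout
{
  private final long timeoutMillis;
  // Time of the most recent progress: job start, or the last successful worker launch.
  private long lastProgressMillis;

  LaunchProgressTimeout(long timeoutMillis, long startMillis)
  {
    this.timeoutMillis = timeoutMillis;
    this.lastProgressMillis = startMillis;
  }

  // Call when a worker task has launched successfully; this restarts the timeout window.
  void recordWorkerLaunched(long nowMillis)
  {
    lastProgressMillis = Math.max(lastProgressMillis, nowMillis);
  }

  // True only if no worker has launched within the last timeoutMillis.
  boolean isTimedOut(long nowMillis)
  {
    return nowMillis - lastProgressMillis >= timeoutMillis;
  }
}
```

With a 10-minute timeout, a job whose workers launch one at a time every few minutes never trips the check, while a job that launches nothing for a full 10 minutes still does.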
