jscheffl commented on PR #48530: URL: https://github.com/apache/airflow/pull/48530#issuecomment-2766886959
> Yeah, I am ok with 5. 10 is excessive, given the time.
>
> My reasoning for this is the same as why we use SimpleAuthManager for dev/first-time users. Folks using it in production can change the setting. It's intentionally lightweight and requires no setup, to reduce friction when people are getting started with the SDK or running quick experiments. But with 10 retries, if the API server is unreachable or misconfigured (wrong port, not running, etc.), the current retry setup causes a silent 5-minute delay before surfacing any error: no logs, no explanation, just apparent stalling. That's IMO a frustrating experience.
>
> Given the retry strategy is exponential (1s, 3s, 7s, 15s, …), 5 retries gets you to ~30 seconds, which is ok'ish.

Okay, this means I strongly need to remember this before we can have a stable 3.0 rolled to prod :-D

As we use Git Sync and the Git repo is quite large, it already takes ~1 min just to `git clone`, without the time to pull the Docker image and start the webserver. With this setting we are unable to restart a backend (if HA fails) within 30 seconds.

I still do not understand the reasoning about first-time users. Is it just about seeing the details of stack traces before reaching 30 seconds? Also, the log details in tenacity can be adjusted if this is a problem.

I really feel like this is a burden for people who want to get from "first time setup" to "stable prod". A follow-up might be that we need to make the hidden parameters official in the config and add this to the best practices.
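For reference, a minimal sketch of the arithmetic behind the quoted retry figures. It assumes the individual waits double from 1s (1s, 2s, 4s, ...), so that the quoted "1s, 3s, 7s, 15s" are running totals. `cumulative_wait`, `base`, and `cap` are hypothetical names for illustration only, not Airflow or tenacity API, and any jitter in the real client is not modeled.

```python
from itertools import accumulate
from typing import List, Optional


def cumulative_wait(retries: int, base: float = 1.0, cap: Optional[float] = None) -> List[float]:
    """Return the total time waited after each retry.

    Assumption: the wait before retry i is base * 2**i seconds,
    optionally limited to `cap` seconds per wait.
    """
    waits = [base * 2**i for i in range(retries)]
    if cap is not None:
        # Limit each individual wait, not the running total.
        waits = [min(w, cap) for w in waits]
    return list(accumulate(waits))


if __name__ == "__main__":
    # Running totals for 5 retries under the doubling assumption.
    print(cumulative_wait(5))
```

Under this assumption, 5 retries add up to 31 seconds, which matches the "~30 seconds" quoted above and is shorter than the ~1 min `git clone` alone. The 5-minute figure quoted for 10 retries would depend on a per-wait cap that is not stated in this thread, which is why `cap` is left as a parameter here.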
