potiuk commented on code in PR #60919:
URL: https://github.com/apache/airflow/pull/60919#discussion_r2717204287
##########
airflow-core/src/airflow/config_templates/config.yml:
##########
@@ -1446,6 +1446,24 @@ api:
type: string
example: ~
default: "False"
+ worker_refresh_interval:
+ description: |
+ Number of seconds between rolling worker refreshes for the API server.
+ When enabled, workers are periodically restarted using a rolling
restart
+ pattern (new workers spawn before old ones are killed) to prevent
memory
+ accumulation while maintaining zero downtime. Set to 0 to disable.
+ version_added: 3.2.0
+ type: integer
+ example: ~
+ default: "0"
Review Comment:
Should we use some more robust default here?
I guess we could have min 2 workers **by default** and have rolling restarts
enabled by default, and refresh interval set to some default, reasonable value
(say 1hr).
While it increases memory used as explained in the PR description, it is far
more robust for "out-of-the-box" installation, and if users would like to save
memory by limiting it to 1, they still should be able to do so (and then we
could disable the uvicorn restarts altogether). I think it's more important to
have "robust non leaking, gently restarting" by default, rather than
"potentially leaking smaller initial memory".
Since Fast API recommendation is 1 worker per Pod - we could also recommend
to the users - in the same docs - to use the rolling restarts of deployment
instead those restarts that are handled by Uvicorn monitor. And we could add
description to the number of workers paraneter, that setting it to 1 is ok as
long as you have more than one instance of api-server and implement refreshing
rolling restart outside of Airlfow. This way, the users who would like to
change it to 1 (and read the docs that is) - would know that they also need to
take care about potential leaks.
There are even some ways to do it automatically - which we could implement
in our Helm Chart as an option potentially:
https://stackoverflow.com/questions/67423919/k8s-cronjob-rolling-restart-every-day
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]