potiuk edited a comment on pull request #13278: URL: https://github.com/apache/airflow/pull/13278#issuecomment-755300173
> I still feel migration is the best place.
>
> > For the db fix up, I am thinking that alembic might not be the right tool since it's supposed to be ran before deploying the application code. If we fix up the value through alembic, the existing code could still write invalid values into the db until it has been replaced by new code.
>
> Once a new version is released (i.e 2.0.1) -- first step is to run migration at which point the application code uses the new code -- so I don't think it will have invalid values.. unless I am missing something.

I think this depends on how users do the migration. I do not think the Airflow documentation describes a procedure like this anywhere:

1) shut down all your workers (CeleryExecutor) and wait for all your tasks to complete (KubernetesExecutor)
2) shut down your webserver
3) shut down your scheduler
4) perform the migration
5) start your webserver
6) start your workers
7) start your scheduler

While this seems a reasonable approach (that's how I would do it), I do not think we provide any tooling or instructions for it (such as a tool that tells you "OK, all tasks completed, you can now safely migrate Airflow"). Users might have different expectations (e.g. a live migration, or at least not having to wait for Kubernetes task Pods to complete before migrating).

I can very easily imagine that some tasks from the old Airflow are still running during the migration, especially in the KubernetesExecutor case. I believe in 2.0 we do even more in running tasks when they complete (the small "dependent tasks" run after a task completes), and I think this part of the code modifies state as well. So I can very easily imagine a situation where the "main" Airflow is already migrated but some task still running from the previous version modifies the state.

Similarly with the webserver - I guess when you trigger tasks, you can also change the state (am I right?)
So what happens if you have a new webserver and an old db, or an old webserver and an already migrated db? There are different deployment mechanisms and it is hard to make an "atomic" change to your deployment where all components are upgraded at precisely the same time - sometimes it is even impossible.

I think we should either:

a) document, and maybe even somehow enforce, that the migration cannot be run while old tasks (and possibly the webserver) are running, or
b) handle the case when those tasks are running during the migration

Otherwise we risk being flooded with "inconsistent DB" issues from our users, and we will have to figure out how to fix them.
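
To make option (a) a bit more concrete, here is a minimal sketch of the kind of "are all tasks completed?" check mentioned above. This is not existing Airflow tooling. The `task_instance` table with a `state` column, the choice of `queued` and `running` as the "active" states, and the function names `count_active_tasks`, `safe_to_migrate` and `make_demo_db` are all assumptions made for illustration, and the demo runs against an in-memory SQLite stand-in rather than a real metadata db.

```python
import sqlite3

# Assumption: states in which a task may still write to the metadata db.
ACTIVE_STATES = ("queued", "running")


def count_active_tasks(conn):
    """Return the number of task instances still in an active state."""
    placeholders = ", ".join("?" for _ in ACTIVE_STATES)
    query = f"SELECT COUNT(*) FROM task_instance WHERE state IN ({placeholders})"
    return conn.execute(query, ACTIVE_STATES).fetchone()[0]


def safe_to_migrate(conn):
    """Return True only when no task instance could still modify state."""
    return count_active_tasks(conn) == 0


def make_demo_db(states):
    """Build an in-memory stand-in for the metadata db (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE task_instance (task_id TEXT, state TEXT)")
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?)",
        [(f"task_{i}", state) for i, state in enumerate(states)],
    )
    return conn


if __name__ == "__main__":
    # One task is still running, so the check should refuse the migration.
    conn = make_demo_db(["success", "running", "failed"])
    print(safe_to_migrate(conn))
```

A check like this only helps if nothing can start new tasks after it passes, so the scheduler and webserver would presumably have to be stopped first. Otherwise it just narrows the window in which old code can still write to the db.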
