potiuk edited a comment on pull request #13278:
URL: https://github.com/apache/airflow/pull/13278#issuecomment-755300173


   > I still feel migration is the best place.
   > 
   > > For the db fix-up, I am thinking that alembic might not be the right 
tool since it's supposed to be run before deploying the application code. If we 
fix up the value through alembic, the existing code could still write invalid 
values into the db until it has been replaced by new code.
   > 
   > Once a new version is released (e.g. 2.0.1) -- the first step is to run 
the migration, at which point the application uses the new code -- so I don't 
think it will have invalid values, unless I am missing something.
   
   I think this depends on how users do the migration. I don't think the 
Airflow documentation describes a procedure like this anywhere:
   
   1) shut down all your workers (CeleryExecutor) and wait for all your tasks 
to complete (KubernetesExecutor)
   2) shut down your webserver
   3) shut down your scheduler
   4) perform migration
   5) start your webserver
   6) start your workers
   7) start your scheduler
   
   While this seems a reasonable approach (that's how I would do it), I do not 
think we provide any tooling or instructions for it (like a tool that tells 
you "OK, all tasks completed, you can now safely migrate Airflow").
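
A check of that kind could be sketched roughly as below. This is only an 
illustration, not an existing Airflow tool: the function names 
(`active_task_count`, `safe_to_migrate`, `demo_connection`) and the set of 
states treated as "active" are assumptions, and an in-memory SQLite table 
stands in for the `task_instance` table of the metadata database.

```python
import sqlite3

# Assumption: task states that mean a task may still write to the metadata DB.
ACTIVE_STATES = ("queued", "running")


def active_task_count(conn: sqlite3.Connection) -> int:
    """Count task instances whose state suggests they are still in flight."""
    placeholders = ", ".join("?" for _ in ACTIVE_STATES)
    row = conn.execute(
        f"SELECT COUNT(*) FROM task_instance WHERE state IN ({placeholders})",
        ACTIVE_STATES,
    ).fetchone()
    return row[0]


def safe_to_migrate(conn: sqlite3.Connection) -> bool:
    """Return True only when no task instance is queued or running."""
    return active_task_count(conn) == 0


def demo_connection(states):
    """Build a throwaway in-memory table holding the given task states."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE task_instance (task_id TEXT, state TEXT)")
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?)",
        [(f"task_{i}", state) for i, state in enumerate(states)],
    )
    return conn
```

An operator would run such a check against the real database after stopping 
the scheduler, so that no new tasks get queued while waiting for the count to 
reach zero.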
   
   Users might have different expectations (e.g. live migration, or at 
least not having to wait for Kubernetes task Pods to complete before doing the 
migration). And I can very easily imagine that some of the tasks from the old 
Airflow might still be running during the migration, especially with the 
KubernetesExecutor.
   
   I believe in 2.0 we do even more in running tasks when they complete 
(the small 'dependent tasks' run after a task completes), and I think this 
part of the code essentially modifies state as well. So I can very easily 
imagine a situation where the 'main' Airflow is already migrated but some task 
still running from the previous version modifies the state.
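
The scenario can be illustrated with a small simulation. It is a sketch under 
stated assumptions, not the code from this PR: the `demo` table, its `kind` 
column, the valid values and the function names are all hypothetical, and 
SQLite stands in for the metadata database.

```python
import sqlite3

# Hypothetical rule: only 'scheduled' and 'manual' are valid values of `kind`.
INVALID_FILTER = "kind NOT IN ('scheduled', 'manual')"


def fix_up(conn):
    """One-off data fix-up, as a migration would do it: repair invalid rows."""
    conn.execute(f"UPDATE demo SET kind = 'manual' WHERE {INVALID_FILTER}")


def old_writer(conn):
    """Stand-in for a task from the previous version that is still running."""
    conn.execute("INSERT INTO demo (kind) VALUES ('')")


def invalid_rows(conn):
    """Count rows that violate the hypothetical rule."""
    query = f"SELECT COUNT(*) FROM demo WHERE {INVALID_FILTER}"
    return conn.execute(query).fetchone()[0]


def simulate():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (kind TEXT)")
    conn.execute("INSERT INTO demo (kind) VALUES ('')")  # pre-existing bad row
    fix_up(conn)                      # the migration runs first ...
    after_migration = invalid_rows(conn)
    old_writer(conn)                  # ... then an old task writes again
    after_stale_write = invalid_rows(conn)
    return after_migration, after_stale_write
```

In this sketch the table is clean right after the fix-up, but the late write 
from the old code reintroduces an invalid row, which is the inconsistency 
described above.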
   
   Similarly with the webserver - I guess when you trigger tasks, you can also 
change the state (am I right?), so what happens if you have a new webserver and 
an old db, or an old webserver and an already migrated db? There are different 
deployment mechanisms and it is hard to make an "atomic" change to your 
deployment where all components are upgraded at precisely the same time - 
sometimes it is even impossible.
   
   I think we should either:
   
   a) document, and maybe even somehow enforce, that migration cannot be run 
while old tasks (and possibly the old webserver) are running
   or
   b) handle the case when those tasks are running during migration
   
   Otherwise we risk being flooded with "inconsistent DB" issues from our 
users and we will have to figure out how to fix them.
   

