kaxil commented on pull request #19637:
URL: https://github.com/apache/airflow/pull/19637#issuecomment-971875937


   >But isn't it the case that just "clearing" the serialized dags is 
equivalent to emptying the cache? I do not think just cleaning the serialized 
fields will take a lot of time in any sizeable database - it's just marking the 
fields as empty, which is almost a no-op. It's the compound time of 
re-serializing that will take some time.
   
   Yeah, I am talking about the time between starting the upgrade and being 
able to run the DAGs again, i.e. the downtime. Imagine 1000 files that were 
serialized incrementally as they were added or modified. Now we need to 
serialize all of them again because we cleared them, so we have to wait until 
every file is re-serialized before the DAGs in those files can run. 
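
   To put rough numbers on that, here is a minimal back-of-the-envelope sketch. 
The function name, the per-file parse time and the number of parallel parsers 
are all made-up assumptions for illustration, not measurements from any real 
deployment:

```python
import math


def reserialize_downtime_seconds(
    num_files: int, parse_seconds_per_file: float, parallel_parsers: int
) -> float:
    """Rough wall-clock time until every file has been parsed and serialized again.

    Assumes every file takes the same time to parse and that parsers work
    through the files in equally sized batches (both are simplifications).
    """
    if parallel_parsers < 1:
        raise ValueError("need at least one parser")
    batches = math.ceil(num_files / parallel_parsers)
    return batches * parse_seconds_per_file


# Hypothetical numbers: 1000 files, 2 seconds each, 2 parsers in parallel.
print(reserialize_downtime_seconds(1000, 2.0, 2))  # 1000.0 seconds, about 17 minutes
```

   Under those assumptions the DAGs in the last file only become runnable 
after roughly 17 minutes, even though clearing the table itself was instant.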
   
   >IMHO marking all dags as "invalid" at each upgrade is a far more "resilient" 
approach than reserialization, looking also at the cases we had. We've 
introduced "accidental" incompatibilities in serialisation and there is no 
guarantee it won't happen again (and we have no protection/tests preventing it 
from happening again), so "reserialize all at upgrade" for me is an easy 
solution that helps us deal with accidental mistakes we can (and will) make.
   
   We need to take greater care with those "accidental" incompatibilities. If 
we don't have tests for them, we should add those tests rather than work around 
their absence by re-serializing everything. 
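
   As a sketch of the kind of test I mean: keep a payload serialized by an 
older release as a fixture and assert that the current code can still read it. 
Everything below is a stand-in for illustration (`OLD_PAYLOAD`, 
`SUPPORTED_VERSIONS` and `deserialize_dag` are hypothetical names, not 
Airflow's actual serialization API):

```python
import json

# Hypothetical fixture: a payload captured from an older release and checked in.
OLD_PAYLOAD = json.dumps(
    {"__version": 1, "dag": {"dag_id": "example", "tasks": ["a", "b"]}}
)

# Hypothetical set of payload versions the current code promises to read.
SUPPORTED_VERSIONS = {1}


def deserialize_dag(payload: str) -> dict:
    """Stand-in for the real deserializer (an assumption, not Airflow's API)."""
    data = json.loads(payload)
    if data["__version"] not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported serialization version: {data['__version']}")
    dag = data["dag"]
    return {"dag_id": dag["dag_id"], "tasks": list(dag["tasks"])}


def test_old_payload_still_loads():
    # Fails as soon as a change makes the old on-disk format unreadable.
    dag = deserialize_dag(OLD_PAYLOAD)
    assert dag["dag_id"] == "example"
    assert dag["tasks"] == ["a", "b"]
```

   A test along these lines would flag an incompatible change at review time 
instead of leaving it to be discovered after an upgrade.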
   
   >IMHO marking all dags as "invalid" at each upgrade is a far more "resilient" 
approach than reserialization, looking also at the cases we had. 
   
   It means longer, unnecessary downtime for most users when they are just 
upgrading patch versions. This feels like a bad idea. I shouldn't have to wait 
for all my DAGs to get parsed again when I am just upgrading from 2.2.1 to 
2.2.2, for example.



