kaxil commented on pull request #19637:
URL: https://github.com/apache/airflow/pull/19637#issuecomment-971875937

>But isn't that the case that just "clearing" the serialized dags is equivalent to emptying the cache? I do not think just cleaning the serialized fields will take a lot of time in any sizeable database - it's just marking the fields as empty which is mostly almost no-op. It's the compound time of re-serializing that will take some time.

Yeah, I am talking about the time between starting the upgrade and being able to run the DAGs again, i.e. the "downtime". Imagine 1000 files that were serialized incrementally as they were added or modified. Once we clear them, every one of them has to be serialized again, so we have to wait until all of these files are re-serialized before the DAGs in them can run.

>IMHO marking all dags as "invalid" at each upgrade is far more "resilent" approach than reserialization looking also at the cases we had. We've introduced "accidental" incompatibilities in serialisation and there is no guarantee it won't happen again (and we have no protection/tests preventing it from it happening again), so "reserialize all at upgrade" for me is an easy solution that helps us dealing with accidental mistakes we can (and will) make.

We need to take greater care to avoid those "accidental" incompatibilities. If we don't have tests, we should add tests for them, not the other way around.

>IMHO marking all dags as "invalid" at each upgrade is far more "resilent" approach than reserialization looking also at the cases we had.

It means longer, unnecessary downtime for most users when they are just upgrading patch versions. This feels like a bad idea. I shouldn't have to wait for all my DAGs to get parsed again when I am just upgrading from 2.2.1 to 2.2.2, for example.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
