anshuksi282-ksolves commented on PR #56476: URL: https://github.com/apache/airflow/pull/56476#issuecomment-3381910257
> Fantastic catch. I was wondering why we had ended up with 1000s of dag versions. I think this has really been hurting performance. What will be the best way to deal with the numerous duplicate dag versions that have been filling up the db?

Hi, thanks! Yes, the numerous duplicate DAG versions were caused by the previous non-deterministic serialization. My fix makes future DAG parses produce deterministic hashes, so logically identical DAGs will no longer create unnecessary new versions.

The existing duplicates in the DB will need a manual cleanup, for example by keeping only the latest version per DAG. Going forward, the fix should prevent the performance issues caused by repeated, unnecessary DAG versioning.
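To illustrate the idea, here is a minimal sketch in plain Python. It is not the code from this PR and does not use Airflow's API or schema. The names `stable_hash`, `versions_to_remove`, and the `(dag_id, version_number)` row shape are hypothetical, chosen only for the example. It assumes the non-determinism comes from unordered containers and key order, and that "latest" means the highest version number.

```python
import hashlib
import json


def _normalize(value):
    """Fallback for json.dumps: give unordered sets a fixed order."""
    if isinstance(value, (set, frozenset)):
        return sorted(value, key=repr)
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def stable_hash(serialized_dag: dict) -> str:
    """Hash a dict so that logically identical content gives the same digest."""
    payload = json.dumps(
        serialized_dag,
        sort_keys=True,         # key order no longer affects the output
        separators=(",", ":"),  # fixed whitespace
        default=_normalize,     # sets are emitted in sorted order
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def versions_to_remove(rows):
    """Given hypothetical (dag_id, version_number) rows, return every row
    except the highest version number of each DAG."""
    rows = list(rows)
    latest = {}
    for dag_id, version in rows:
        latest[dag_id] = max(version, latest.get(dag_id, version))
    return sorted((d, v) for d, v in rows if v != latest[d])
```

In a real database the version rows are likely referenced by other tables, so any actual cleanup should be tried on a backup first.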
