anshuksi282-ksolves commented on PR #56476:
URL: https://github.com/apache/airflow/pull/56476#issuecomment-3381910257

   > Fantastic catch. I was wondering why we had ended up with 1000s of dag 
versions. I think this has really been hurting performance. What will be the 
best way to deal with the numerous duplicate dag versions that have been 
filling up the db?
   
   Hi, thanks! Yes, the numerous duplicate DAG versions were caused by the 
previous non-deterministic serialization: the same DAG could serialize 
differently from one parse to the next, so its hash changed and a new version 
was recorded. My fix makes future DAG parses generate deterministic hashes, so 
logically identical DAGs will no longer create unnecessary new versions.
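
   To illustrate the idea only (this is not the code from the PR, and 
`dag_hash`, `naive_hash`, and the sample dicts are hypothetical names made up 
for this sketch): if the serialized form depends on dict insertion order, two 
logically identical DAGs can hash differently, while serializing with sorted 
keys gives a stable hash. SHA-256 is used here purely for the example.

```python
import hashlib
import json


def naive_hash(serialized: dict) -> str:
    """Hash the dict as-is; the result depends on key insertion order."""
    payload = json.dumps(serialized)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def dag_hash(serialized: dict) -> str:
    """Hash the dict with sorted keys; key order no longer matters."""
    # sort_keys=True makes the JSON text independent of insertion order.
    payload = json.dumps(serialized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two hypothetical serialized DAGs with the same content in a different order.
first_parse = {"dag_id": "example", "tasks": ["t1", "t2"], "tags": ["x"]}
second_parse = {"tags": ["x"], "tasks": ["t1", "t2"], "dag_id": "example"}

same_with_sorting = dag_hash(first_parse) == dag_hash(second_parse)
same_without_sorting = naive_hash(first_parse) == naive_hash(second_parse)
```

   Note that sorted keys only address ordering of dict keys; other sources of 
non-determinism (for example unordered collections) would need their own 
normalization before hashing.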
   
   For the existing duplicates in the DB, a manual cleanup will be needed, for 
example by keeping only the latest version per DAG.
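
   As a rough sketch of what "keep only the latest version per DAG" could look 
like, here is a toy example against an in-memory SQLite table. The table name 
`demo_dag_version` and its columns are assumptions made up for this 
illustration, not Airflow's real schema; in a real metadata DB the version rows 
may be referenced by other tables, so take a backup and check those references 
before deleting anything.

```python
import sqlite3

# Toy stand-in for the metadata DB; table and columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_dag_version ("
    "id INTEGER PRIMARY KEY, dag_id TEXT, version_number INTEGER)"
)
conn.executemany(
    "INSERT INTO demo_dag_version (dag_id, version_number) VALUES (?, ?)",
    [("etl", 1), ("etl", 2), ("etl", 3), ("report", 1), ("report", 2)],
)


def keep_latest_per_dag(connection: sqlite3.Connection) -> int:
    """Delete every row that is not the highest version for its dag_id."""
    cursor = connection.execute(
        """
        DELETE FROM demo_dag_version
        WHERE version_number < (
            SELECT MAX(v2.version_number)
            FROM demo_dag_version AS v2
            WHERE v2.dag_id = demo_dag_version.dag_id
        )
        """
    )
    connection.commit()
    return cursor.rowcount  # number of rows removed


deleted = keep_latest_per_dag(conn)
remaining = conn.execute(
    "SELECT dag_id, version_number FROM demo_dag_version ORDER BY dag_id"
).fetchall()
```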
   
   With the fix in place, the repeated unnecessary DAG versioning, and the 
performance issues it caused, should not recur going forward.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.