dstandish commented on code in PR #25959:
URL: https://github.com/apache/airflow/pull/25959#discussion_r958747290


##########
tests/models/test_dag.py:
##########
@@ -692,14 +692,14 @@ def test_bulk_write_to_db(self):
                 assert row[0] is not None
 
         # Re-sync should do fewer queries
-        with assert_queries_count(8):
+        with assert_queries_count(16):

Review Comment:
   so, before, we only needed to look at the python object to determine whether we 
needed to do any data operations.  if there were dataset references, we'd add 
them.
   
   however, we also need to _remove_ them from the database if they were there 
before but are no longer on the python objects.
   
   so that means even for dags and tasks that don't have any references, we 
have to check if they did _before_.  and for ones that _do_ have references, we 
have to check that there aren't any in the DB that need to be deleted.
   
   so given that, it makes sense that queries would increase.  now, whether the 
current approach is _optimal_ from a performance perspective, that i'm not sure 
of, but i wanted to at least get it working in a reasonable way.
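
   to illustrate the add-and-remove logic described above, here is a minimal 
sketch in plain python.  it is not the code in this PR: `plan_reference_sync` 
and the dict-of-sets inputs are hypothetical stand-ins for the ORM objects and 
the rows read back from the DB.  the point is only that computing deletions 
requires loading the stored references for every dag, including dags that 
currently have none, which is where the extra queries come from.

```python
def plan_reference_sync(current_refs, stored_refs):
    """Work out which dataset references to add and which to delete.

    Hypothetical helper for illustration only.
    current_refs: dag_id -> set of dataset URIs on the python objects
    stored_refs:  dag_id -> set of dataset URIs already in the database
    """
    to_add = {}
    to_delete = {}
    # Visit every dag seen on either side: a dag with no current
    # references may still have stale rows that must be removed.
    for dag_id in current_refs.keys() | stored_refs.keys():
        current = current_refs.get(dag_id, set())
        stored = stored_refs.get(dag_id, set())
        missing = current - stored  # on the object, not yet in the DB
        stale = stored - current    # in the DB, no longer on the object
        if missing:
            to_add[dag_id] = missing
        if stale:
            to_delete[dag_id] = stale
    return to_add, to_delete


# Example data (made up): dag_a dropped one reference, dag_b dropped its only one.
current = {"dag_a": {"s3://bucket/x"}, "dag_b": set()}
stored = {
    "dag_a": {"s3://bucket/x", "s3://bucket/old"},
    "dag_b": {"s3://bucket/y"},
}
to_add, to_delete = plan_reference_sync(current, stored)
```

   in this example nothing needs to be added, but both dags have stale 
references to delete, and finding them required reading `stored` for `dag_b` 
even though it has no references on the python side.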



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
