potiuk edited a comment on pull request #15714:
URL: https://github.com/apache/airflow/pull/15714#issuecomment-841816345


   I think what also matters is how SQLAlchemy works and how we are using it. 
I am not too concerned about changing the default to READ_COMMITTED 
for all transactions, because:
   
   a) we usually have very short transactions, and we usually work on 
the same set of rows/tables that we retrieve in the first query of the transaction
   
   b) SQLAlchemy retrieves the rows we are working on and stores them as 
objects in memory; only when we flush them or commit the transaction does 
SQLAlchemy merge the changes back.
   
   c) The scheduler works (in 2.0) in small, "tight" loops. Basically, it 
retrieves N records (say the first 100 matching the criteria) and locks them, 
and then only that thread of that scheduler makes any changes to those rows and 
related data (merging them back). Then it commits, goes back, and retrieves 
the next 100 matching rows. So contention and parallel access to the same rows 
are not really possible (under normal circumstances). 
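   
   To illustrate the loop in (c), here is a minimal sketch, not the actual 
scheduler code. It uses the standard-library `sqlite3` module so that it runs 
anywhere; the `task` table, its `state` values, and `process_in_batches` are 
hypothetical names chosen for the example. SQLite has no row-level locks, so 
the locking clause is only indicated in a comment.

```python
import sqlite3

BATCH_SIZE = 100  # the "N" from the description; value is illustrative


def process_in_batches(conn, batch_size=BATCH_SIZE):
    """Pick up to batch_size matching rows, update them, commit, repeat."""
    batches = 0
    while True:
        # On PostgreSQL or MySQL 8+ this SELECT would end with
        # "FOR UPDATE SKIP LOCKED" so that concurrent schedulers skip
        # rows already claimed by another one. SQLite does not support it.
        rows = conn.execute(
            "SELECT id FROM task WHERE state = 'queued' ORDER BY id LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            break
        # Only this worker touches the rows it has just selected.
        conn.executemany(
            "UPDATE task SET state = 'scheduled' WHERE id = ?",
            [(row[0],) for row in rows],
        )
        # Short transaction: commit right after the batch, then loop again.
        conn.commit()
        batches += 1
    return batches


# Hypothetical data: 250 queued rows in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany("INSERT INTO task (state) VALUES (?)", [("queued",)] * 250)
conn.commit()

batches = process_in_batches(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM task WHERE state = 'queued'"
).fetchone()[0]
```

   With 250 queued rows and a batch size of 100, the loop commits three 
times, and each transaction only ever touches the rows it selected itself.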
   
   Also, the scheduler (which is the important one) does indeed use SKIP LOCKED 
(but I believe different schedulers locking the same gaps might cause 
deadlocks in some scenarios even when SKIP LOCKED is used). 
   
   @ashb -> I might not have the whole picture, so maybe you can comment here. 
BTW, I think this might be one of the interesting topics for your talk at the 
Summit :) https://twitter.com/AshBerlin/status/1393861492282429443
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

