omkar-foss commented on PR #40963:
URL: https://github.com/apache/airflow/pull/40963#issuecomment-2269504571

   @potiuk when you get some time, can you please review this PR? I've added 
this check to prevent corresponding task instances in parallel DagRuns from 
trying to process the same file (resulting in a race condition) while 
ignoring `wait_for_downstream=True`.
   
   You can find the full RCA in [this 
comment](https://github.com/apache/airflow/issues/40841#issuecomment-2245709461).
 To reproduce the issue, load the DAG in [this 
gist](https://gist.github.com/shai-ikko/45fc6ae32556fbed519a0b2a3007d8a2), 
then modify several files to trigger several DagRuns via the sensor. 
You'll observe that the check in this PR prevents the race condition, but I'm 
not entirely certain that this check is the most elegant way to fix this issue, 
so I need your guidance here.
   
   In case you conclude that this check isn't required, please feel free to 
close this PR, but let me know the right approach so I can look into it. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
