shai-ikko opened a new issue, #40841:
URL: https://github.com/apache/airflow/issues/40841

   ### Apache Airflow version
   
   main (development)
   
   ### If "Other Airflow 2 version" selected, which one?
   
   _No response_
   
   ### What happened?
   
   Airflow has two `start_date` fields on two separate structures: the Task and 
the Task Instance.
   
   The code that interprets the `wait_for_downstream` flag is part of 
`prev_dagrun_dep.py`; before it checks for the presence of unfinished 
downstream tasks, though, it looks at the task `start_date`, comparing it to 
the execution date of the last DAG run -- which seems perfectly logical.
   
   
https://github.com/apache/airflow/blob/63662044583031fc27d98af02f2913d324245db0/airflow/ti_deps/deps/prev_dagrun_dep.py#L159-L163
   
   However, I'm seeing the task `start_date` on a sensor updated with each new 
DAG run. As a result, it is never less than the execution date of the last DAG 
run, so Airflow always treats every instance as "the first instance of its task".
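   To make the effect concrete, here is a simplified, hypothetical paraphrase of 
the linked check as standalone Python (not the actual Airflow code): a TI whose 
task `start_date` is newer than the previous DAG run's execution date gets 
treated as the first instance of its task, and `wait_for_downstream` is skipped.

```python
from datetime import datetime, timezone

def treated_as_first_ti(task_start_date, last_dagrun_execution_date):
    # Hypothetical paraphrase of the condition in prev_dagrun_dep.py:
    # a start_date newer than the previous run's execution date means
    # "first instance of its task", so the dep passes unconditionally.
    return task_start_date is not None and task_start_date > last_dagrun_execution_date

prev_run = datetime(2024, 7, 10, tzinfo=timezone.utc)

# Static start_date set when the DAG was authored: the dep is evaluated.
print(treated_as_first_ti(datetime(2024, 7, 1, tzinfo=timezone.utc), prev_run))   # False

# start_date mutated on every run (the behaviour reported here): the check
# is always skipped, because the date is always newer than the last run.
print(treated_as_first_ti(datetime(2024, 7, 11, tzinfo=timezone.utc), prev_run))  # True
```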
   
   I encountered this using `S3KeySensor`, but I've checked its code and it 
doesn't touch `start_date`. Nor does any other task in my DAG, AFAICT. I've 
been able to reproduce the issue in a DAG I can share that doesn't require 
access to an S3 service.
   
   The problem is also visible in the web UI: when I open the details of a task 
instance after some later runs have occurred and click "more details", I see an 
instance `start_date` that is older than the task `start_date`, because the 
task `start_date` has been updated since.
   
   I included a more detailed investigation in the discussion: 
https://github.com/apache/airflow/discussions/40451
   
   ### What you think should happen instead?
   
   As a result of the wrong task `start_date`, `wait_for_downstream` doesn't work.
   I have a DAG where the sensor is marked `wait_for_downstream=True` and watches 
for a file with a given suffix to show up. The task immediately downstream from 
the sensor renames such files, giving them a different suffix. Because of the 
failure, the next instance of the sensor often finds the file before it is 
renamed, but of course, only one of the renaming task instances can succeed; 
the other one fails, which triggers all sorts of handling.
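   The race itself can be shown outside Airflow entirely. A minimal standalone 
sketch (hypothetical file and suffix names, not the actual DAG or gist): two 
"sensor" pokes both see the `.incoming` file because nothing waits for the 
downstream rename, but only one rename can win.

```python
import tempfile
from pathlib import Path

tmp = Path(tempfile.mkdtemp())
(tmp / "use-zx81.incoming").touch()

def sensor_pokes():
    # stands in for the sensor's file-suffix check
    return any(tmp.glob("*.incoming"))

def rename_task():
    # stands in for the downstream task that changes the suffix;
    # raises StopIteration when another instance already renamed the file
    incoming = next(tmp.glob("*.incoming"))
    incoming.rename(incoming.with_suffix(".done"))

print(sensor_pokes())  # True  -- run 1's sensor fires
print(sensor_pokes())  # True  -- run 2's sensor fires on the same file
rename_task()          # run 1's rename succeeds
try:
    rename_task()      # run 2's rename finds nothing left
except StopIteration:
    print("second rename failed")
```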
   
   ### How to reproduce
   
   Take the DAG from 
https://gist.github.com/shai-ikko/45fc6ae32556fbed519a0b2a3007d8a2 -- it has 
some more downstream processing after the problem is triggered, but I think it 
helps to clarify the issue.
   
   Using Breeze, put this DAG in `files/dags/` and create a directory 
`files/tmp/`. Start the DAG. To trigger runs, create files with the suffix 
`.incoming` in the tmp directory; e.g., from outside the container,
   ```console
   $ touch files/tmp/use-zx81.incoming
   ```
   Now look at the DAG on the web interface. Following the touch, I see the 
sensor (`detect_incoming`) succeed twice, sometimes even three times, but the 
next task (`send_to_processing`) can only succeed once.
   
   Also, click one of the earlier TI runs, and check its details:
   
![image](https://github.com/user-attachments/assets/2d7638f4-c0ab-4746-8c4a-d9467b7b0a11)
   then click "more details" and compare the `start_date` in the Task Instance 
Attributes to the `start_date` in the Task Attributes; you'll find that the 
former is earlier than the latter, and you may find the latter matching the 
start date of the most recent DAG run at the time you look.
   
   ### Operating System
   
   Ubuntu 20.04.6 LTS
   
   ### Versions of Apache Airflow Providers
   
   No relevant providers
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   ```console
   $ breeze down
   Good version of Docker: 27.0.3.
   Good version of docker-compose: 2.28.1
   ```
   
   
   ### Anything else?
   
   This happens to me every time.
   
   It may be relevant that my native system time is UTC+3. In some of the logs 
I've seen times reported in UTC, and in some logs the times were 6 hours off; 
it seems the native time in the Docker containers is UTC and something is 
trying to compensate for that.
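   For what it's worth, timezone-aware datetimes compare by absolute instant, so 
a UTC+3 display offset on its own should not be able to flip the `start_date` 
comparison; some naive or wrongly-labelled timestamp would have to be involved. 
A quick stdlib sketch:

```python
from datetime import datetime, timedelta, timezone

# A UTC+3 wall-clock time and its UTC rendering are the same instant,
# so aware comparisons are unaffected by how the time is displayed.
local = datetime(2024, 7, 17, 12, 0, tzinfo=timezone(timedelta(hours=3)))
as_utc = local.astimezone(timezone.utc)
print(as_utc.hour)      # 9
print(local == as_utc)  # True
```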
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

