MatrixManAtYrService opened a new issue, #26256: URL: https://github.com/apache/airflow/issues/26256
### Apache Airflow version

2.4.0b1

### What happened

I have [this test dag](https://gist.github.com/MatrixManAtYrService/2cf0ebbd85faa2aac682d9c441796c58) which I created to report [this issue](https://github.com/apache/airflow/issues/25210).

The idea is that if you unpause "sink" and all of the "sources" then the sources will wait until the clock is like \*:\*:00 and they'll terminate at the same time.

Since each source triggers the sink with a dataset called "counter", the "sink" dag will run just once, and it will have output like: `INFO - [(16, 1)]`, that's 16 sources and 1 sink that ran.

At this point, you can look at the dataset history for "counter" and you'll see this:

<img width="524" alt="Screen Shot 2022-09-08 at 6 07 44 PM" src="https://user-images.githubusercontent.com/5834582/189248999-d31141a4-2d0b-4ec2-9ea5-c4c3536b3a28.png">

So we've got a timestamp, but the "triggered runs" count is empty. That's weird. One run was triggered (and it finished by the time the screenshot was taken), so why doesn't it say `1`?

So I redeploy and try it again, except this time I wait several seconds between each "unpause" click, the idea being that maybe some of them fire at 07:16:00 and the others fire at 07:17:00. I end up with this:

<img width="699" alt="Screen Shot 2022-09-08 at 6 19 12 PM" src="https://user-images.githubusercontent.com/5834582/189252116-69067189-751d-40e7-89c5-8d1da1720237.png">

So fifteen of them finished at once and caused the dataset to update, and then just one straggler (number 9) is waiting for an additional minute. I wait for the straggler to complete and go back to the dataset view:

<img width="496" alt="Screen Shot 2022-09-08 at 6 20 41 PM" src="https://user-images.githubusercontent.com/5834582/189253874-87bb3eb3-2237-42a1-bc3f-9fc210419f1a.png">

Now it's the straggler that is blank, but the rest of them are populated.

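For readers who have not opened the gist, here is a minimal sketch of the "wait until the clock reads \*:\*:00" alignment described above. It is not copied from the gist and does not use Airflow; the helper name `seconds_until_next_minute` and the sample timestamps are assumptions made for illustration only.

```python
from datetime import datetime, timedelta


def seconds_until_next_minute(now: datetime) -> float:
    """Return how long to sleep so that waking lands on the next *:*:00 boundary."""
    # Hypothetical helper: truncate to the current minute, then step one minute ahead.
    next_minute = now.replace(second=0, microsecond=0) + timedelta(minutes=1)
    return (next_minute - now).total_seconds()


# Illustrative start times: three sources started at different seconds of 07:15.
starts = [datetime(2022, 9, 8, 7, 15, s) for s in (5, 20, 59)]

# Each source sleeps until the boundary, so all of them wake at the same instant.
wake_times = {
    start + timedelta(seconds=seconds_until_next_minute(start)) for start in starts
}
print(wake_times)
```

Under this assumption, sources unpaused within the same minute finish together, while one unpaused just after a boundary waits for the following minute, which would match the "straggler" seen in the second screenshot.
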
Continuing to manually run these, I find that whichever one I have run most recently is blank, and all of the others are 1, even if this is the second or third time I've run them.

### What you think should happen instead

- The triggered runs counter should increment beyond 1
- It should increment immediately after the dag was triggered, not wait until after the *next* dag gets triggered.

### How to reproduce

See dags in this gist: https://gist.github.com/MatrixManAtYrService/2cf0ebbd85faa2aac682d9c441796c58

1. unpause "sink"
2. unpause half of sources
3. wait one minute
4. unpause the other half of the sources
5. wait for "sink" to run a second time
6. view the dataset history for "counter"
7. ask why only half of them are populated
8. manually trigger some sources, wait for them to trigger sink
9. view the dataset history again
10. ask why none of them show more than 1 dagrun triggered

### Operating System

Kubernetes in Docker, deployed via helm

### Versions of Apache Airflow Providers

n/a

### Deployment

Other 3rd-party Helm chart

### Deployment details

see "deploy.sh" in the gist: https://gist.github.com/MatrixManAtYrService/2cf0ebbd85faa2aac682d9c441796c58

It's just a fresh install into a k8s cluster

### Anything else

n/a

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
