ephraimbuddy commented on issue #57618: URL: https://github.com/apache/airflow/issues/57618#issuecomment-3946467620
> [@ephraimbuddy](https://github.com/ephraimbuddy) can you provide the steps to reproduce the issue, so that others from the community can pick it?

The reproduction is not reliable. When I reproduced it, I ran 4 schedulers and 2 api-servers, and ran this DAG file in a Kubernetes cluster:

```python
import time
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def hello_world():
    print("Doing my job...")
    time.sleep(10)
    print("Job done!")


def create_dag(dag_id):
    with DAG(
        dag_id=dag_id,
        schedule="15 */1 * * *",
        start_date=datetime(2025, 1, 1),
        catchup=False,
        tags=["test:sadp_environment"],
    ) as dag:
        # Chain five tasks sequentially: task_1 >> task_2 >> ... >> task_5
        previous = None
        for i in range(1, 6):
            t = PythonOperator(
                task_id=f"task_{i}",
                python_callable=hello_world,
            )
            if previous:
                previous >> t
            previous = t
    return dag


# Register 499 DAGs (sadp_dag_1 through sadp_dag_499)
for n in range(1, 500):
    dag_id = f"sadp_dag_{n}"
    globals()[dag_id] = create_dag(dag_id)
```

After some hours, I searched the database for task instances that are on their second try but have no history row for the first try:

```sql
SELECT ti.dag_id, ti.task_id, ti.run_id
FROM task_instance ti
WHERE ti.try_number = 2
  AND NOT EXISTS (
    SELECT 1
    FROM task_instance_history tih
    WHERE tih.dag_id = ti.dag_id
      AND tih.task_id = ti.task_id
      AND tih.run_id = ti.run_id
      AND tih.map_index = ti.map_index
      AND tih.try_number = 1
  );
```

On subsequent runs, I couldn't reproduce it again.
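For anyone who wants to sanity-check the query before running it against a real metadata database, here is a minimal sketch that exercises it on an in-memory SQLite database. The two tables are simplified stand-ins containing only the columns the query touches, not Airflow's actual schema, and the names `build_demo_db` and `find_missing_history` as well as the sample rows are hypothetical.

```python
import sqlite3

# Same check as the SQL above: task instances on try 2 with no history row for try 1.
MISSING_HISTORY_SQL = """
SELECT ti.dag_id, ti.task_id, ti.run_id
FROM task_instance ti
WHERE ti.try_number = 2
  AND NOT EXISTS (
    SELECT 1
    FROM task_instance_history tih
    WHERE tih.dag_id = ti.dag_id
      AND tih.task_id = ti.task_id
      AND tih.run_id = ti.run_id
      AND tih.map_index = ti.map_index
      AND tih.try_number = 1
  )
ORDER BY ti.dag_id, ti.task_id, ti.run_id
"""

# Assumption: a reduced schema with only the columns used by the query.
COLUMNS = "dag_id TEXT, task_id TEXT, run_id TEXT, map_index INTEGER, try_number INTEGER"


def build_demo_db():
    """Create an in-memory database with one healthy and one affected task instance."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE task_instance ({COLUMNS})")
    conn.execute(f"CREATE TABLE task_instance_history ({COLUMNS})")
    run_id = "scheduled__2025-01-01T00:15:00"
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?, ?, ?, ?)",
        [
            ("sadp_dag_1", "task_1", run_id, -1, 2),  # try 2, history present
            ("sadp_dag_2", "task_1", run_id, -1, 2),  # try 2, history missing
            ("sadp_dag_3", "task_1", run_id, -1, 1),  # still on try 1, ignored
        ],
    )
    conn.execute(
        "INSERT INTO task_instance_history VALUES (?, ?, ?, ?, ?)",
        ("sadp_dag_1", "task_1", run_id, -1, 1),
    )
    conn.commit()
    return conn


def find_missing_history(conn):
    """Return (dag_id, task_id, run_id) tuples for affected task instances."""
    return conn.execute(MISSING_HISTORY_SQL).fetchall()


if __name__ == "__main__":
    for row in find_missing_history(build_demo_db()):
        print(row)
```

Only the `sadp_dag_2` row should come back, since it is the one task instance on try 2 without a matching try 1 entry in the history table.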
