hussein-awala opened a new issue, #26200:
URL: https://github.com/apache/airflow/issues/26200

   ### Apache Airflow version
   
   2.3.4
   
   ### What happened
   
   Take an hourly DAG with two tasks `T1` and `T2` (`T1 >> T2`), where `T2` has 
`depends_on_past=True`. If `T1` is skipped in the 10:00 run, `T2` is 
immediately marked as skipped too, without waiting for `T2` of the previous 
run (09:00), which may not have finished yet. Then, in the next run (11:00), 
`T2` can be executed because `T2` of its previous run (10:00) is skipped.
   
   ### What you think should happen instead
   
   If a task has `depends_on_past=True`, we should not change its state 
until the same task in the previous run is marked as `succeeded` or 
`skipped`.
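
To make the expected rule concrete, here is a minimal pure-Python sketch. It is not Airflow's scheduler code: `State`, `DONE_STATES` and `resolve_t2` are hypothetical names invented for this illustration, and the model only covers the two-task case described above.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    RUNNING = "running"
    SUCCESS = "success"
    SKIPPED = "skipped"


# Previous-run states that satisfy depends_on_past in this simplified model.
DONE_STATES = {State.SUCCESS, State.SKIPPED}


def resolve_t2(upstream_t1: State, previous_t2: Optional[State]) -> Optional[State]:
    """Return the new state for T2, or None to leave T2 untouched for now.

    previous_t2 is None when there is no previous run.
    """
    # Proposed rule: do not touch T2 until the previous run's T2 is done.
    if previous_t2 is not None and previous_t2 not in DONE_STATES:
        return None
    # T1 has not finished yet, so there is nothing to decide.
    if upstream_t1 is State.RUNNING:
        return None
    # Only now propagate the upstream skip, or let T2 start.
    if upstream_t1 is State.SKIPPED:
        return State.SKIPPED
    return State.RUNNING
```

Under this rule, `resolve_t2(State.SKIPPED, State.RUNNING)` returns `None`: the 10:00 `T2` stays untouched while the 09:00 `T2` is still running, instead of being skipped right away.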
   
   ### How to reproduce
   
   Here is a simple DAG that reproduces the problem:
   ```python
   from datetime import datetime
   
   from airflow import DAG
   from airflow.exceptions import AirflowSkipException
   from airflow.operators.bash import BashOperator
   from airflow.operators.python import PythonOperator
   
   with DAG(
           dag_id='dag',
           start_date=datetime(2022, 1, 1),
           schedule_interval=None,
   ) as dag:
       def t1_callable(**context):
           # Skip T1 when the run is triggered with {"skip_t1": true}.
           dag_run_conf = context["dag_run"].conf
           if dag_run_conf.get("skip_t1"):
               raise AirflowSkipException()
   
       t1 = PythonOperator(
           task_id="T1",
           python_callable=t1_callable
       )
       # T2 sleeps for a configurable time and depends on its previous run.
       t2 = BashOperator(
           task_id="T2",
           bash_command="sleep {{dag_run.conf.get('sleep_seconds', 1)}}",
           depends_on_past=True,
       )
       t1 >> t2
   
   ```
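   Then trigger three runs with the Airflow CLI. The first keeps `T2` running 
for 1000 seconds, the second skips `T1`, and the third runs normally: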
   ```bash
   airflow dags trigger dag --conf '{"sleep_seconds":1000}'
   airflow dags trigger dag --conf '{"skip_t1":true}'
   airflow dags trigger dag --conf '{"sleep_seconds":10}'
   ```
   
   ### Operating System
   
   Debian GNU/Linux
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   In some cases `max_active_runs` can help avoid this problem, but we have 
DAGs with tasks that do not depend on past runs. We prefer to run those tasks 
as soon as possible and keep the DAG run active while it waits for the tasks 
that do depend on past runs.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

