shahar1 commented on issue #33164:
URL: https://github.com/apache/airflow/issues/33164#issuecomment-2254513051
> Example DAG
>
> ```
> import pendulum
>
> from airflow import DAG
> from airflow.decorators import task
> from airflow.operators.empty import EmptyOperator
>
> with DAG(
>     'removed_mapped_tasks',
>     schedule='@daily',
>     start_date=pendulum.datetime(2023, 8, 7),
> ) as dag:
>
>     @task
>     def gen_elements():
>         return [1, 2, 3]
>
>     @task
>     def mapped_task(element):
>         return element * 2
>
>     mapped_task.expand(element=gen_elements()) >> EmptyOperator(task_id='end')
> ```
>
> Let it complete and then return one less element from the `gen_elements` task. Then clear the last DAG run.
>
> The `end` task will not get scheduled because `Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_states=_UpstreamTIStates(success=2, skipped=0, failed=0, upstream_failed=0, removed=1, done=3), upstream_task_ids={'mapped_task'}`
>
> In this very simple DAG, the run will be marked failed with `scheduler | [2023-08-09T13:49:56.448+0300] {dagrun.py:651} ERROR - Task deadlock (no runnable tasks); marking run <DagRun removed_mapped_tasks @ 2023-08-06 21:00:00+00:00: scheduled__2023-08-06T21:00:00+00:00, state:running, queued_at: 2023-08-09 10:49:44.110567+00:00. externally triggered: False> failed`
>
> On more complex structures the deadlock detection might not kick in.
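
For context, `all_success` is the default trigger rule, so the quoted upstream-state check is exactly what keeps `end` unscheduled once a mapped task instance ends up in the `removed` state. Just to illustrate which knob is involved (not as a fix for the removed-state accounting), a downstream task can opt into a laxer rule:

```
from airflow.operators.empty import EmptyOperator
from airflow.utils.trigger_rule import TriggerRule

# Runs once all upstream task instances are done, regardless of their
# terminal state - including 'removed' mapped instances.
end = EmptyOperator(task_id='end', trigger_rule=TriggerRule.ALL_DONE)
```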
I tried to run this example on v2.9.3 (first run with the original DAG, then remove one element and clear the last DAG run) and I didn't manage to reproduce it - all tasks succeeded.
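For reference, "remove one element" here was the obvious one-line change to `gen_elements` (my variant, not necessarily the reporter's exact code), after which I cleared the latest DAG run (from the UI; `airflow tasks clear` with an appropriate date range would also work):

```
@task
def gen_elements():
    # Originally [1, 2, 3]; dropping one element should leave the third
    # mapped task instance in the 'removed' state after the run is cleared.
    return [1, 2]
```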
I'm closing this issue; please create a new one if you encounter this again in versions 2.9.3+ (with a reproducible example).