AetherUnbound opened a new issue, #25353:
URL: https://github.com/apache/airflow/issues/25353

   ### Apache Airflow version
   
   2.3.0
   
   ### What happened
   
   While running a backfill for one of our DAGs, we noticed that the backfill 
stalled after all of the tasks for a given DAG run had completed. 
`max_active_runs` for this DAG is set to `1`, so the entire backfill stalled 
even though every task in the last DAG run it started completed successfully.
   
   ### What you think should happen instead
   
   I would expect that once all of the tasks in a DAG run are complete (whether 
succeeded, skipped, or failed) during a backfill, the backfill would mark the 
DAG run with the proper state and proceed with the remaining runs.
   
   ### How to reproduce
   
   Here is a simulacrum of our DAG with all the actual logic stripped out:
   
   ```python
   from datetime import datetime
   
   from airflow.decorators import dag
   from airflow.exceptions import AirflowSkipException
   from airflow.operators.python import PythonOperator
   from airflow.utils.task_group import TaskGroup
   from airflow.utils.trigger_rule import TriggerRule
   
   
   def skipme():
       raise AirflowSkipException("Skip")
   
   
   def run():
       return
   
   
   @dag(
       schedule_interval="@daily",
       start_date=datetime(2022, 7, 14),
       catchup=False,
       max_active_runs=1,
   )
   def sample_dag_with_skip():
       a = PythonOperator(
           task_id="first",
           python_callable=run,
       )
   
       with TaskGroup(group_id="subgroup") as tg:
           b = PythonOperator(
               task_id="run_and_skip",
               trigger_rule=TriggerRule.NONE_SKIPPED,
               python_callable=skipme,
           )
           c = PythonOperator(
               task_id="run_fine",
               trigger_rule=TriggerRule.NONE_SKIPPED,
               python_callable=skipme,
           )
   
           d = PythonOperator(
               task_id="gather",
               python_callable=run,
           )
   
           e = PythonOperator(
               task_id="always_succeed",
               trigger_rule=TriggerRule.ALL_DONE,
               python_callable=run,
           )
   
           [b, c] >> d >> e
   
       f = PythonOperator(
           task_id="final",
           trigger_rule=TriggerRule.ALL_DONE,
           python_callable=run,
       )
   
       a >> tg >> f
   
   
   skip_dag = sample_dag_with_skip()
   ```
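   For context, here's how I understand the trigger rules cascading in this DAG 
once `first` succeeds and both `subgroup` leaf tasks raise 
`AirflowSkipException` (a hand-rolled simplification for illustration, not 
Airflow's actual evaluation logic):

   ```python
   from enum import Enum


   class State(str, Enum):
       SUCCESS = "success"
       SKIPPED = "skipped"
       FAILED = "failed"


   def should_run(trigger_rule: str, upstream: list[State]) -> bool:
       """Toy trigger-rule check over upstream terminal states."""
       if trigger_rule == "all_success":  # the default rule
           return all(s is State.SUCCESS for s in upstream)
       if trigger_rule == "none_skipped":
           return not any(s is State.SKIPPED for s in upstream)
       if trigger_rule == "all_done":
           # Runs once every upstream has reached any terminal state.
           return True
       raise ValueError(f"unmodeled rule: {trigger_rule}")


   # run_and_skip and run_fine both call skipme(), so both end up skipped.
   # gather uses the default all_success rule and is therefore skipped too:
   assert not should_run("all_success", [State.SKIPPED, State.SKIPPED])
   # always_succeed (all_done) and final (all_done) still run:
   assert should_run("all_done", [State.SKIPPED])
   assert should_run("all_done", [State.SUCCESS])
   ```

   In other words, every task in the run reaches a terminal state (3 skipped, 
3 succeeded, matching the final progress counts in the logs), yet the run 
itself never finishes.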
   
   Here's a screenshot of the DAG:
   
![image](https://user-images.githubusercontent.com/10214785/181389328-5183b041-1ba3-483f-b18e-c8e6d5338152.png)
   
   Note that the DAG is still shown as "running" even though the last task 
ended several minutes ago:
   
![image](https://user-images.githubusercontent.com/10214785/181389434-6b528f4b-ece7-4c71-bfa2-2a1f879479c6.png)
   
   Here's the backfill command I ran for this DAG: `airflow dags backfill -s 
2022-07-25 -e 2022-07-26 sample_dag_with_skip --reset-dagruns -y`
   
   And here are the logs from the backfill process:
   
   <details> <summary>Backfill logs</summary>
   
   ```
   airflow@42a81ed08a3d:~$ airflow dags backfill -s 2022-07-25 -e 2022-07-26 
sample_dag_with_skip --reset-dagruns -y
   
/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/cli/commands/dag_command.py:57
 PendingDeprecationWarning: --ignore-first-depends-on-past is deprecated as the 
value is always set to True
   [2022-07-27 23:15:44,655] {dagbag.py:507} INFO - Filling up the DagBag from 
/usr/local/airflow/openverse_catalog/dags
   [2022-07-27 23:15:44,937] {urls.py:74} INFO - 
https://creativecommons.org/publicdomain/zero/1.0 was rewritten to 
https://creativecommons.org/publicdomain/zero/1.0/
   [2022-07-27 23:15:44,948] {media.py:63} INFO - Initialized image MediaStore 
with provider brooklynmuseum
   [2022-07-27 23:15:44,948] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,948] {media.py:186} INFO - Output path: 
/var/workflow_output/brooklynmuseum_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,952] {media.py:63} INFO - Initialized image MediaStore 
with provider europeana
   [2022-07-27 23:15:44,952] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,952] {media.py:186} INFO - Output path: 
/var/workflow_output/europeana_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,953] {media.py:63} INFO - Initialized image MediaStore 
with provider finnishmuseums
   [2022-07-27 23:15:44,953] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,953] {media.py:186} INFO - Output path: 
/var/workflow_output/finnishmuseums_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,955] {media.py:63} INFO - Initialized image MediaStore 
with provider flickr
   [2022-07-27 23:15:44,955] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,955] {media.py:186} INFO - Output path: 
/var/workflow_output/flickr_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,957] {media.py:63} INFO - Initialized audio MediaStore 
with provider freesound
   [2022-07-27 23:15:44,957] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,957] {media.py:186} INFO - Output path: 
/var/workflow_output/freesound_audio_v001_20220727231544.tsv
   [2022-07-27 23:15:44,959] {media.py:63} INFO - Initialized audio MediaStore 
with provider jamendo
   [2022-07-27 23:15:44,959] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,959] {media.py:186} INFO - Output path: 
/var/workflow_output/jamendo_audio_v001_20220727231544.tsv
   [2022-07-27 23:15:44,961] {media.py:63} INFO - Initialized image MediaStore 
with provider met
   [2022-07-27 23:15:44,961] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,961] {media.py:186} INFO - Output path: 
/var/workflow_output/met_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,962] {media.py:63} INFO - Initialized image MediaStore 
with provider museumsvictoria
   [2022-07-27 23:15:44,962] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,962] {media.py:186} INFO - Output path: 
/var/workflow_output/museumsvictoria_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,964] {media.py:63} INFO - Initialized image MediaStore 
with provider nypl
   [2022-07-27 23:15:44,964] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,964] {media.py:186} INFO - Output path: 
/var/workflow_output/nypl_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,965] {media.py:63} INFO - Initialized image MediaStore 
with provider phylopic
   [2022-07-27 23:15:44,965] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,965] {media.py:186} INFO - Output path: 
/var/workflow_output/phylopic_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,967] {media.py:63} INFO - Initialized image MediaStore 
with provider rawpixel
   [2022-07-27 23:15:44,967] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,967] {media.py:186} INFO - Output path: 
/var/workflow_output/rawpixel_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,968] {media.py:63} INFO - Initialized image MediaStore 
with provider sciencemuseum
   [2022-07-27 23:15:44,968] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,968] {media.py:186} INFO - Output path: 
/var/workflow_output/sciencemuseum_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,970] {media.py:63} INFO - Initialized image MediaStore 
with provider smithsonian
   [2022-07-27 23:15:44,970] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,970] {media.py:186} INFO - Output path: 
/var/workflow_output/smithsonian_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,971] {media.py:63} INFO - Initialized image MediaStore 
with provider smk
   [2022-07-27 23:15:44,972] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,972] {media.py:186} INFO - Output path: 
/var/workflow_output/smk_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,974] {media.py:63} INFO - Initialized image MediaStore 
with provider waltersartmuseum
   [2022-07-27 23:15:44,974] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,974] {media.py:186} INFO - Output path: 
/var/workflow_output/waltersartmuseum_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,976] {media.py:63} INFO - Initialized audio MediaStore 
with provider wikimedia_audio
   [2022-07-27 23:15:44,976] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,976] {media.py:186} INFO - Output path: 
/var/workflow_output/wikimedia_audio_audio_v001_20220727231544.tsv
   [2022-07-27 23:15:44,976] {media.py:63} INFO - Initialized image MediaStore 
with provider wikimedia
   [2022-07-27 23:15:44,976] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,977] {media.py:186} INFO - Output path: 
/var/workflow_output/wikimedia_image_v001_20220727231544.tsv
   [2022-07-27 23:15:44,980] {media.py:63} INFO - Initialized image MediaStore 
with provider wordpress
   [2022-07-27 23:15:44,980] {media.py:168} INFO - No given output directory. 
Using OUTPUT_DIR from environment.
   [2022-07-27 23:15:44,980] {media.py:186} INFO - Output path: 
/var/workflow_output/wordpress_image_v001_20220727231544.tsv
   [2022-07-27 23:15:45,043] {executor_loader.py:106} INFO - Loaded executor: 
LocalExecutor
   [2022-07-27 23:15:45,176] {base_executor.py:91} INFO - Adding to queue: 
['airflow', 'tasks', 'run', 'sample_dag_with_skip', 'first', 
'backfill__2022-07-25T00:00:00+00:00', '--ignore-depends-on-past', '--local', 
'--pool', 'default_pool', '--subdir', 'DAGS_FOLDER/simple_backfill_example.py', 
'--cfg-path', '/tmp/tmpfowbb78c']
   [2022-07-27 23:15:50,050] {local_executor.py:79} INFO - QueuedLocalWorker 
running ['airflow', 'tasks', 'run', 'sample_dag_with_skip', 'first', 
'backfill__2022-07-25T00:00:00+00:00', '--ignore-depends-on-past', '--local', 
'--pool', 'default_pool', '--subdir', 'DAGS_FOLDER/simple_backfill_example.py', 
'--cfg-path', '/tmp/tmpfowbb78c']
   [2022-07-27 23:15:50,060] {dagrun.py:647} WARNING - Failed to get task 
'<TaskInstance: sample_dag_with_skip.skipme backfill__2022-07-25T00:00:00+00:00 
[skipped]>' for dag 'sample_dag_with_skip'. Marking it as removed.
   [2022-07-27 23:15:50,061] {dagrun.py:647} WARNING - Failed to get task 
'<TaskInstance: sample_dag_with_skip.always_run 
backfill__2022-07-25T00:00:00+00:00 [success]>' for dag 'sample_dag_with_skip'. 
Marking it as removed.
   [2022-07-27 23:15:50,063] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 0 of 2 | tasks waiting: 7 | succeeded: 0 | running: 1 | failed: 0 
| skipped: 0 | deadlocked: 0 | not ready: 5
   [2022-07-27 23:15:50,071] {dagbag.py:507} INFO - Filling up the DagBag from 
/usr/local/***/openverse_catalog/dags/simple_backfill_example.py
   [2022-07-27 23:15:50,089] {task_command.py:369} INFO - Running 
<TaskInstance: sample_dag_with_skip.first backfill__2022-07-25T00:00:00+00:00 
[queued]> on host 42a81ed08a3d
   [2022-07-27 23:15:50,486] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:15:50,487] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:15:50,488] {base_aws.py:206} INFO - Credentials retrieved 
from login
   [2022-07-27 23:15:50,488] {base_aws.py:100} INFO - Retrieving region_name 
from Connection.extra_config['region_name']
   [2022-07-27 23:15:50,525] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:15:55,064] {dagrun.py:647} WARNING - Failed to get task 
'<TaskInstance: sample_dag_with_skip.skipme backfill__2022-07-25T00:00:00+00:00 
[removed]>' for dag 'sample_dag_with_skip'. Marking it as removed.
   [2022-07-27 23:15:55,065] {dagrun.py:647} WARNING - Failed to get task 
'<TaskInstance: sample_dag_with_skip.always_run 
backfill__2022-07-25T00:00:00+00:00 [removed]>' for dag 'sample_dag_with_skip'. 
Marking it as removed.
   [2022-07-27 23:15:55,067] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 0 of 2 | tasks waiting: 7 | succeeded: 1 | running: 0 | failed: 0 
| skipped: 0 | deadlocked: 0 | not ready: 5
   [2022-07-27 23:15:55,081] {base_executor.py:91} INFO - Adding to queue: 
['airflow', 'tasks', 'run', 'sample_dag_with_skip', 'subgroup.run_and_skip', 
'backfill__2022-07-25T00:00:00+00:00', '--ignore-depends-on-past', '--local', 
'--pool', 'default_pool', '--subdir', 'DAGS_FOLDER/simple_backfill_example.py', 
'--cfg-path', '/tmp/tmp1uesref4']
   [2022-07-27 23:15:55,170] {base_executor.py:91} INFO - Adding to queue: 
['airflow', 'tasks', 'run', 'sample_dag_with_skip', 'subgroup.run_fine', 
'backfill__2022-07-25T00:00:00+00:00', '--ignore-depends-on-past', '--local', 
'--pool', 'default_pool', '--subdir', 'DAGS_FOLDER/simple_backfill_example.py', 
'--cfg-path', '/tmp/tmpib__7p5u']
   [2022-07-27 23:16:00,058] {local_executor.py:79} INFO - QueuedLocalWorker 
running ['airflow', 'tasks', 'run', 'sample_dag_with_skip', 
'subgroup.run_fine', 'backfill__2022-07-25T00:00:00+00:00', 
'--ignore-depends-on-past', '--local', '--pool', 'default_pool', '--subdir', 
'DAGS_FOLDER/simple_backfill_example.py', '--cfg-path', '/tmp/tmpib__7p5u']
   [2022-07-27 23:16:00,058] {local_executor.py:79} INFO - QueuedLocalWorker 
running ['airflow', 'tasks', 'run', 'sample_dag_with_skip', 
'subgroup.run_and_skip', 'backfill__2022-07-25T00:00:00+00:00', 
'--ignore-depends-on-past', '--local', '--pool', 'default_pool', '--subdir', 
'DAGS_FOLDER/simple_backfill_example.py', '--cfg-path', '/tmp/tmp1uesref4']
   [2022-07-27 23:16:00,063] {dagrun.py:647} WARNING - Failed to get task 
'<TaskInstance: sample_dag_with_skip.skipme backfill__2022-07-25T00:00:00+00:00 
[removed]>' for dag 'sample_dag_with_skip'. Marking it as removed.
   [2022-07-27 23:16:00,064] {dagrun.py:647} WARNING - Failed to get task 
'<TaskInstance: sample_dag_with_skip.always_run 
backfill__2022-07-25T00:00:00+00:00 [removed]>' for dag 'sample_dag_with_skip'. 
Marking it as removed.
   [2022-07-27 23:16:00,065] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 0 of 2 | tasks waiting: 5 | succeeded: 1 | running: 2 | failed: 0 
| skipped: 0 | deadlocked: 0 | not ready: 3
   [2022-07-27 23:16:00,077] {dagbag.py:507} INFO - Filling up the DagBag from 
/usr/local/***/openverse_catalog/dags/simple_backfill_example.py
   [2022-07-27 23:16:00,080] {dagbag.py:507} INFO - Filling up the DagBag from 
/usr/local/***/openverse_catalog/dags/simple_backfill_example.py
   [2022-07-27 23:16:00,096] {task_command.py:369} INFO - Running 
<TaskInstance: sample_dag_with_skip.subgroup.run_fine 
backfill__2022-07-25T00:00:00+00:00 [queued]> on host 42a81ed08a3d
   [2022-07-27 23:16:00,097] {task_command.py:369} INFO - Running 
<TaskInstance: sample_dag_with_skip.subgroup.run_and_skip 
backfill__2022-07-25T00:00:00+00:00 [queued]> on host 42a81ed08a3d
   [2022-07-27 23:16:00,504] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:16:00,505] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:16:00,506] {base_aws.py:206} INFO - Credentials retrieved 
from login
   [2022-07-27 23:16:00,506] {base_aws.py:100} INFO - Retrieving region_name 
from Connection.extra_config['region_name']
   [2022-07-27 23:16:00,514] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:16:00,515] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:16:00,516] {base_aws.py:206} INFO - Credentials retrieved 
from login
   [2022-07-27 23:16:00,516] {base_aws.py:100} INFO - Retrieving region_name 
from Connection.extra_config['region_name']
   [2022-07-27 23:16:00,541] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:16:00,559] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:16:05,071] {dagrun.py:647} WARNING - Failed to get task 
'<TaskInstance: sample_dag_with_skip.skipme backfill__2022-07-25T00:00:00+00:00 
[removed]>' for dag 'sample_dag_with_skip'. Marking it as removed.
   [2022-07-27 23:16:05,072] {dagrun.py:647} WARNING - Failed to get task 
'<TaskInstance: sample_dag_with_skip.always_run 
backfill__2022-07-25T00:00:00+00:00 [removed]>' for dag 'sample_dag_with_skip'. 
Marking it as removed.
   [2022-07-27 23:16:05,073] {dagrun.py:583} ERROR - Deadlock; marking run 
<DagRun sample_dag_with_skip @ 2022-07-25 00:00:00+00:00: 
backfill__2022-07-25T00:00:00+00:00, externally triggered: False> failed
   [2022-07-27 23:16:05,073] {dagrun.py:607} INFO - DagRun Finished: 
dag_id=sample_dag_with_skip, execution_date=2022-07-25 00:00:00+00:00, 
run_id=backfill__2022-07-25T00:00:00+00:00, run_start_date=None, 
run_end_date=2022-07-27 23:16:05.073628+00:00, run_duration=None, state=failed, 
external_trigger=False, run_type=backfill, data_interval_start=2022-07-25 
00:00:00+00:00, data_interval_end=2022-07-26 00:00:00+00:00, dag_hash=None
   [2022-07-27 23:16:05,075] {dagrun.py:799} WARNING - Failed to record 
duration of <DagRun sample_dag_with_skip @ 2022-07-25 00:00:00+00:00: 
backfill__2022-07-25T00:00:00+00:00, externally triggered: False>: start_date 
is not set.
   [2022-07-27 23:16:05,075] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 5 | succeeded: 1 | running: 0 | failed: 0 
| skipped: 2 | deadlocked: 0 | not ready: 3
   [2022-07-27 23:16:05,096] {base_executor.py:91} INFO - Adding to queue: 
['airflow', 'tasks', 'run', 'sample_dag_with_skip', 'subgroup.always_succeed', 
'backfill__2022-07-25T00:00:00+00:00', '--ignore-depends-on-past', '--local', 
'--pool', 'default_pool', '--subdir', 'DAGS_FOLDER/simple_backfill_example.py', 
'--cfg-path', '/tmp/tmpulo4p958']
   [2022-07-27 23:16:10,065] {local_executor.py:79} INFO - QueuedLocalWorker 
running ['airflow', 'tasks', 'run', 'sample_dag_with_skip', 
'subgroup.always_succeed', 'backfill__2022-07-25T00:00:00+00:00', 
'--ignore-depends-on-past', '--local', '--pool', 'default_pool', '--subdir', 
'DAGS_FOLDER/simple_backfill_example.py', '--cfg-path', '/tmp/tmpulo4p958']
   [2022-07-27 23:16:10,067] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 3 | succeeded: 1 | running: 1 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 1
   [2022-07-27 23:16:10,084] {dagbag.py:507} INFO - Filling up the DagBag from 
/usr/local/***/openverse_catalog/dags/simple_backfill_example.py
   [2022-07-27 23:16:10,104] {task_command.py:369} INFO - Running 
<TaskInstance: sample_dag_with_skip.subgroup.always_succeed 
backfill__2022-07-25T00:00:00+00:00 [queued]> on host 42a81ed08a3d
   [2022-07-27 23:16:10,500] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:16:10,501] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:16:10,502] {base_aws.py:206} INFO - Credentials retrieved 
from login
   [2022-07-27 23:16:10,502] {base_aws.py:100} INFO - Retrieving region_name 
from Connection.extra_config['region_name']
   [2022-07-27 23:16:10,537] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:16:15,074] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 3 | succeeded: 2 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 1
   [2022-07-27 23:16:15,091] {base_executor.py:91} INFO - Adding to queue: 
['airflow', 'tasks', 'run', 'sample_dag_with_skip', 'final', 
'backfill__2022-07-25T00:00:00+00:00', '--ignore-depends-on-past', '--local', 
'--pool', 'default_pool', '--subdir', 'DAGS_FOLDER/simple_backfill_example.py', 
'--cfg-path', '/tmp/tmp7ifr68s2']
   [2022-07-27 23:16:20,073] {local_executor.py:79} INFO - QueuedLocalWorker 
running ['airflow', 'tasks', 'run', 'sample_dag_with_skip', 'final', 
'backfill__2022-07-25T00:00:00+00:00', '--ignore-depends-on-past', '--local', 
'--pool', 'default_pool', '--subdir', 'DAGS_FOLDER/simple_backfill_example.py', 
'--cfg-path', '/tmp/tmp7ifr68s2']
   [2022-07-27 23:16:20,075] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 2 | running: 1 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:16:20,094] {dagbag.py:507} INFO - Filling up the DagBag from 
/usr/local/***/openverse_catalog/dags/simple_backfill_example.py
   [2022-07-27 23:16:20,114] {task_command.py:369} INFO - Running 
<TaskInstance: sample_dag_with_skip.final backfill__2022-07-25T00:00:00+00:00 
[queued]> on host 42a81ed08a3d
   [2022-07-27 23:16:20,522] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:16:20,523] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:16:20,524] {base_aws.py:206} INFO - Credentials retrieved 
from login
   [2022-07-27 23:16:20,524] {base_aws.py:100} INFO - Retrieving region_name 
from Connection.extra_config['region_name']
   [2022-07-27 23:16:20,561] {base.py:68} INFO - Using connection ID 
'aws_default' for task execution.
   [2022-07-27 23:16:25,082] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:16:30,083] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:16:35,089] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:16:40,093] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:16:45,099] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:16:50,105] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:16:55,112] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:17:00,117] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:17:05,122] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:17:10,128] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:17:15,133] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:17:20,137] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:17:25,141] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:17:30,145] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:17:35,149] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:17:40,153] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:17:45,157] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:17:50,161] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:17:55,166] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:18:00,169] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:18:05,174] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   [2022-07-27 23:18:10,177] {backfill_job.py:367} INFO - [backfill progress] | 
finished run 1 of 2 | tasks waiting: 2 | succeeded: 3 | running: 0 | failed: 0 
| skipped: 3 | deadlocked: 0 | not ready: 0
   ^C[2022-07-27 23:18:11,199] {backfill_job.py:870} WARNING - Backfill 
terminated by user.
   [2022-07-27 23:18:11,199] {local_executor.py:390} INFO - Shutting down 
LocalExecutor; waiting for running tasks to finish.  Signal again if you don't 
want to wait.
   ```
   
   </details>
   
   ---
   
   It's worth noting that I tried to replicate this with a DAG that had only a 
single skip task, and with a DAG shaped like skip -> run -> skip, and both 
backfilled successfully. So my guess would be that there's an odd interaction 
between the TaskGroup, skipped tasks, trigger rules, and possibly 
`max_active_runs`.
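
   For what it's worth, my rough mental model of the stall (a toy model of what 
I imagine the backfill loop checks each tick — not Airflow's actual code) is 
that a run only finishes once no tasks are left waiting, so if the bookkeeping 
never moves the last "waiting" tasks forward, the loop polls forever:

   ```python
   def backfill_tick(counts: dict) -> str:
       """One tick of a toy backfill loop, keyed like the fields of the
       "[backfill progress]" log line."""
       if counts["running"] or counts["not ready"]:
           return "in progress"
       if counts["waiting"]:
           # Nothing is running or becoming ready, yet tasks are still
           # counted as waiting: no event will ever change this state.
           return "stalled"
       return "finished"


   # The endlessly repeating final log lines correspond to this state:
   print(backfill_tick({"waiting": 2, "running": 0, "not ready": 0}))
   ```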
   
   
   ### Operating System
   
   Debian GNU/Linux 11 (bullseye)
   
   ### Versions of Apache Airflow Providers
   
   ```
   apache-airflow-providers-amazon==3.3.0
   apache-airflow-providers-ftp==2.1.2
   apache-airflow-providers-http==2.1.2
   apache-airflow-providers-imap==2.2.3
   apache-airflow-providers-postgres==4.1.0
   apache-airflow-providers-sqlite==2.1.3
   ```
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   This is a custom-configured Docker image, but it doesn't deviate much from a 
standard deployment: 
https://github.com/WordPress/openverse-catalog/blob/main/docker/airflow/Dockerfile
   
   ### Anything else
   
   I'll keep trying to pare the DAG down further to see which pieces I can 
throw out while still replicating the error. I don't think I'd be comfortable 
submitting a PR for this one because my gut says it's probably deep in the 
bowels of the codebase 😅 If it's something clear or straightforward though, 
I'd be happy to take a stab at it! I'd just need to be pointed in the right 
direction within the codebase 🙂 
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

