Chais opened a new issue, #53667:
URL: https://github.com/apache/airflow/issues/53667

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### If "Other Airflow 2 version" selected, which one?
   
   2.10.5
   
   ### What happened?
   
   I have a task that filters a list based on values of a second list. As such 
it depends on two upstream tasks, each producing one of the lists. The second 
upstream task is mapped dynamically.
   
   With the default trigger rule `all_success`, the task in question is skipped 
if any of the mapped tasks skips, which feels counter-intuitive.
   
   With the `all_done` trigger rule the task behaves as expected, waiting for 
all mapped tasks to finish and executing with the resulting mapped list of 
outputs.  
   However, if _all_ of the mapped tasks skip, which is a valid result in this 
context, the task still tries to execute.
   
   This might be related to #51320.
   
   ### What you think should happen instead?
   
   The task receives the original list, but instead of the second list of 
filter values it receives `None`. I believe this happens because the mapped 
return values of skipped tasks (`None`) are reduced to `None`.  
   In my opinion it would be more appropriate to reduce them to `[]`.
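   
   As a pure-Python sketch (not Airflow's actual internals), the reduction I 
would expect looks like this, where `reduce_mapped_results` is a hypothetical 
stand-in for however the XComs of mapped task instances get collected:
   
   ```python
   def reduce_mapped_results(results):
       """Collect mapped return values, dropping skipped instances (modeled as None)."""
       return [r for r in results if r is not None]


   # All instances skipped: reduce to [] rather than None.
   assert reduce_mapped_results([None, None, None]) == []
   # Mixed outcome: keep only the values that were actually returned.
   assert reduce_mapped_results([3, None, 7]) == [3, 7]
   ```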
   
   Apart from that, the task should just be skipped.
   
   ### How to reproduce
   
   ```python
   from random import random
   from typing import List
   
   import pendulum
   from airflow.decorators import dag, task, task_group
   from airflow.exceptions import AirflowSkipException
   
   
   @dag("playground", "Try things", schedule=None, start_date=pendulum.now())
   def playground():
       @task.python
       def produce_list() -> List[int]:
           return list(range(25))
   
       @task.python
       def maybe_abort(value: int) -> int:
           if random() > 0.8:
               return value
           raise AirflowSkipException()
   
       @task.python
       def filter_list(values1: List[int], values2: List[int]) -> List[int]:
           # if values1 is None or values2 is None:
           #     raise AirflowSkipException()
           return [v for v in values1 if v in values2]
   
       values1 = produce_list()
       values2 = maybe_abort.expand(value=values1)
       filter_list.override(trigger_rule="all_done")(values1, values2)
   
   
   playground()
   
   if __name__ == "__main__":
       playground().test()
   ```
   
   This issue can be worked around with the commented check, but I don't think 
we should have to validate task inputs.
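   
   Spelled out as a pure-Python sketch, the guard behaves like this 
(`SkipSentinel` stands in for `AirflowSkipException` here):
   
   ```python
   class SkipSentinel(Exception):
       """Stand-in for AirflowSkipException in this sketch."""


   def filter_list(values1, values2):
       # Guard against the None that arrives when all mapped upstreams skip.
       if values1 is None or values2 is None:
           raise SkipSentinel()
       return [v for v in values1 if v in values2]


   assert filter_list([1, 2, 3], [2, 3]) == [2, 3]
   try:
       filter_list([1, 2, 3], None)
   except SkipSentinel:
       pass  # the task would be skipped instead of failing
   ```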
   
   ### Operating System
   
   Ubuntu 24.04
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
