kiaradlf opened a new issue, #39477:
URL: https://github.com/apache/airflow/issues/39477
### Description
i am using dynamic task mapping, and have a task that should process data
from different upstream (dynamically mapped) tasks with similar return values.
as a result, my task currently shows up twice in the UI, whereas i would like
to have it as a single mapped task taking the union of multiple upstream tasks
as input.
### Use case/motivation
i am implementing an ETL pipeline in Apache Airflow across different API
endpoints:
- extract: scrape API endpoint
- transform
- load
the docs go into how dynamic task mapping can reuse tasks, which in my case
helps generalize across endpoints for example.
i would then have tasks return wherever they stored their output, such that
the next one can continue where the previous one left off.
in practice there is a bit of branching involved tho (with the scraping of
some endpoints depending on the data of others, e.g. `/foos/` and `/bars/` can
come first, but `/foos/{id}/bazs/` would come after).
nevertheless, i would prefer to instantiate my transform/load steps just
once, taking the scraped results as their input, rather than duplicating them
across the branches.
this makes me wonder: is there a way to 'union' `XComArg`s so as to
combine the results from such tasks?
i imagine one might add a task taking various mapped tasks as inputs and
simply returning a sequence of their results. however, it sounds like this
would both require a custom implementation and depend on all upstream tasks
finishing first, preventing downstream tasks from triggering until that
point.
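to make that workaround concrete, here is a minimal sketch of what the body of
such a collecting task might look like, in plain python so it runs without
Airflow. the name `union_results` and the example paths are hypothetical; in a
DAG i assume this function would be wrapped with `@task` and handed the outputs
of the mapped extract tasks, with its return value then passed to the
transform step's `.expand()`.

```python
from itertools import chain
from typing import Iterable, List


def union_results(*upstreams: Iterable[str]) -> List[str]:
    """Flatten the results of several upstream tasks into one list.

    Each argument stands in for the sequence of return values of one
    mapped task (assumption: every value is a storage location string).
    """
    # order is preserved: all results of the first upstream, then the second, ...
    return list(chain.from_iterable(upstreams))


# hypothetical outputs of two mapped extract tasks
foos = ["/data/foos/1.json", "/data/foos/2.json"]
bazs = ["/data/foos/1/bazs.json"]

combined = union_results(foos, bazs)
print(combined)
```

this also shows the drawback: the function can only run once every upstream
sequence is complete, so nothing downstream starts before the slowest extract
has finished.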
am i simply missing something about how to use Airflow?
i unfortunately failed to find an existing union-like operation defined on
the `XComArg` class.
### Related issues
n/a
### Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)