hussein-awala commented on issue #29202:
URL: https://github.com/apache/airflow/issues/29202#issuecomment-1407741460
For me this is not a bug: with the `one_success` trigger rule, the task runs as soon as at least one parent succeeds; we do not wait for all parents to finish.
In your case the parents are the mapped tasks, so after one of the mapped task instances succeeds, the scheduler expands the downstream task on its next scheduling cycle using the output of whichever mapped instances have finished so far (at least one), creating one mapped instance per available output. The same happens for each subsequent task.
For your use case, I suggest mapping over a task group (available since Airflow 2.5), so that each `symbol` gets an ETL pipeline independent of the others:
```python
import pendulum

from airflow.decorators import dag, task, task_group


@dag(
    schedule="0 0 * * MON-FRI",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    catchup=False,
    max_active_runs=1,
)
def etl_dag():
    @task
    def get_symbols():
        res = [("A", 1, 111), ("B", 2, 222)]
        return res

    @task_group()
    def etl_tg(symbol):
        @task
        def extract(symbol_info, data_interval_end=None):
            # Do some work...
            return symbol_info

        @task
        def transform(symbol_info, data_interval_end=None):
            # Do some work...
            return symbol_info

        @task
        def load(symbol_info, data_interval_end=None):
            # Do some work...
            return symbol_info

        raw_symbol_data = extract(symbol_info=symbol)
        clean_symbol_data = transform(symbol_info=raw_symbol_data)
        return load(symbol_info=clean_symbol_data)

    # DAG: one etl_tg group instance is mapped per symbol
    symbols = get_symbols()
    etl_tg.expand(symbol=symbols)


etl_dag()
```