kacpermuda opened a new issue, #47615:
URL: https://github.com/apache/airflow/issues/47615
### Apache Airflow version
main (development)
### If "Other Airflow 2 version" selected, which one?
_No response_
### What happened?
When testing asset scheduling on Airflow 3, I encountered some errors. DAGs
scheduled on multiple datasets, like:
```python
from airflow.providers.standard.operators.bash import BashOperator
from airflow.datasets import Dataset
from airflow import DAG
import datetime as dt

with DAG(
    dag_id='dataset_consumer_two_datasets_as_list',
    start_date=dt.datetime(2024, 7, 3),
    schedule=[Dataset("ds1", {"some_extra": 1}), Dataset("ds2")],
) as dag2:
    task_2 = BashOperator(
        task_id="task_2",
        bash_command="exit 0;",
    )
```
display this error in the UI:
```
Traceback (most recent call last):
File "/usr/local/lib/python3.9/urllib/parse.py", line 121, in _decode_args
return tuple(x.decode(encoding, errors) if x else '' for x in args)
File "/usr/local/lib/python3.9/urllib/parse.py", line 121, in <genexpr>
return tuple(x.decode(encoding, errors) if x else '' for x in args)
AttributeError: 'dict' object has no attribute 'decode'
```
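For context, the `AttributeError` originates in `urllib.parse`'s argument coercion: when the first argument is not a `str`, every argument is treated as bytes and `.decode()` is called on it. My assumption (the exact Airflow call site is not confirmed) is that the asset's extra dict ends up being passed where a URI string is expected. A minimal standalone reproduction of the same exception:

```python
import urllib.parse

# urlsplit() expects a str (or bytes) URI. Passing a dict makes _coerce_args
# take the bytes path and call .decode() on the dict, which fails exactly
# like the traceback above.
try:
    urllib.parse.urlsplit({"some_extra": 1})
except AttributeError as exc:
    print(exc)  # 'dict' object has no attribute 'decode'
```

This suggests the bug is in whatever serializes or renders the asset condition for the UI, not in the DAG definitions themselves.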
The same happens for any schedule using more than one dataset:
```python
# In addition to the imports in the first snippet
# (assumed import paths for Airflow 3):
from airflow.timetables.assets import AssetOrTimeSchedule
from airflow.timetables.trigger import CronTriggerTimetable

with DAG(
    dag_id='dataset_consumer_two_datasets_as_list',
    start_date=dt.datetime(2024, 7, 3),
    schedule=[Dataset("ds1", {"some_extra": 1}), Dataset("ds2")],
) as dag2:
    task_2 = BashOperator(
        task_id="task_2",
        bash_command="exit 0;",
    )

with DAG(
    dag_id='dataset_consumer_two_datasets_with_or',
    start_date=dt.datetime(2024, 7, 3),
    schedule=(Dataset("ds1", {"some_extra": 1}) | Dataset("ds2")),
) as dag3:
    task_3 = BashOperator(
        task_id="task_3",
        bash_command="exit 0;",
    )

with DAG(
    dag_id='dataset_consumer_complex_logical_dependency',
    start_date=dt.datetime(2024, 7, 3),
    schedule=(
        (Dataset("ds1", {"some_extra": 1}) | Dataset("ds2"))
        & (Dataset("ds3") | Dataset("ds4"))
    ),
) as dag4:
    task_4 = BashOperator(
        task_id="task_4",
        bash_command="exit 0;",
    )

with DAG(
    dag_id='dataset_consumer_dataset_or_time_schedule',
    start_date=dt.datetime(2024, 7, 3),
    schedule=AssetOrTimeSchedule(
        timetable=CronTriggerTimetable("0 0 * 4 *", timezone="UTC"),
        # datasets=(
        assets=(
            (Dataset("ds1", {"some_extra": 1}) | Dataset("ds2"))
            & (Dataset("ds3") | Dataset("ds4", {"another_extra": 345}))
        )
    ),
) as dag5:
    task_5 = BashOperator(
        task_id="task_5",
        bash_command="exit 0;",
    )
```
The interesting part is that a single Dataset seems to work just fine:
```python
with DAG(
    dag_id='dataset_consumer_single_dataset',
    start_date=dt.datetime(2024, 7, 3),
    schedule=Dataset("ds1"),
) as dag0:
    task_0 = BashOperator(
        task_id="task_0",
        bash_command="exit 0;",
    )

with DAG(
    dag_id='dataset_consumer_single_dataset_as_list',
    start_date=dt.datetime(2024, 7, 3),
    schedule=[Dataset("ds1")],
) as dag1:
    task_1 = BashOperator(
        task_id="task_1",
        bash_command="exit 0;",
    )

with DAG(
    dag_id='dataset_consumer_dataset_or_time_schedule',
    start_date=dt.datetime(2024, 7, 3),
    schedule=AssetOrTimeSchedule(
        timetable=CronTriggerTimetable("0 0 * 4 *", timezone="UTC"),
        assets=Dataset("ds1")
    ),
) as dag5:
    task_5 = BashOperator(
        task_id="task_5",
        bash_command="exit 0;",
    )
```
### What you think should happen instead?
_No response_
### How to reproduce
Run Breeze on `main` and try running any of the DAGs above.
### Operating System
macOS 15.3.1 (24D70)
### Versions of Apache Airflow Providers
_No response_
### Deployment
Virtualenv installation
### Deployment details
Breeze
### Anything else?
Maybe I'm doing something wrong or my installation is somehow broken; let me
know.
### Are you willing to submit PR?
- [x] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]