mobuchowski opened a new issue, #41505:
URL: https://github.com/apache/airflow/issues/41505
### Apache Airflow version
2.10.0rc1
### If "Other Airflow 2 version" selected, which one?
_No response_
### What happened?
In a degenerate case, `get_tree_view` can take a lot of memory.
For a DAG like this one
```
from airflow import DAG
from airflow.models.baseoperator import chain

# LongEmptyOperator is used as in the original report; its definition is not shown here.
with DAG("aaa_big_get_tree_view", schedule=None) as dag:
    # A linear chain of 900 tasks, each with a ~240-character task_id.
    first_set = [LongEmptyOperator(task_id=f"hello_{i}_{'a' * 230}") for i in range(900)]
    chain(*first_set)

    # Fan out five groups of 900 tasks from the last task of the chain.
    last_task_in_first_set = first_set[-1]
    chain(
        last_task_in_first_set,
        [LongEmptyOperator(task_id=f"world_{i}_{'a' * 230}") for i in range(900)],
    )
    chain(
        last_task_in_first_set,
        [LongEmptyOperator(task_id=f"this_{i}_{'a' * 230}") for i in range(900)],
    )
    chain(
        last_task_in_first_set,
        [LongEmptyOperator(task_id=f"is_{i}_{'a' * 230}") for i in range(900)],
    )
    chain(
        last_task_in_first_set,
        [LongEmptyOperator(task_id=f"silly_{i}_{'a' * 230}") for i in range(900)],
    )
    chain(
        last_task_in_first_set,
        [LongEmptyOperator(task_id=f"stuff_{i}_{'a' * 230}") for i in range(900)],
    )
```
serializing it can take 2.7GiB, as the memray report for the test shows:
```
root@a24bae3584cb:/opt/airflow# pytest --memray tests/providers/openlineage/utils/test_utils.py::test_get_dag_tree_large_dag
============================= test session starts ==============================
platform linux -- Python 3.12.5, pytest-8.3.2, pluggy-1.5.0 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /opt/airflow
configfile: pyproject.toml
plugins: memray-1.7.0, timeouts-1.2.1, icdiff-0.9, mock-3.14.0, rerunfailures-14.0, requests-mock-1.12.1, xdist-3.6.1, asyncio-0.23.8, anyio-4.4.0, instafail-0.5.0, cov-5.0.0, time-machine-2.15.0, custom-exit-code-0.3.0
asyncio: mode=Mode.STRICT
setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s
collected 1 item
tests/providers/openlineage/utils/test_utils.py::test_get_dag_tree_large_dag PASSED [100%]
================================ MEMRAY REPORT =================================
Allocation results for tests/providers/openlineage/utils/test_utils.py::test_get_dag_tree_large_dag at the high watermark
	 📦 Total memory allocated: 5.4GiB
	 📏 Total allocations: 23
	 📊 Histogram of allocation sizes: |▁▁█ |
	 🥇 Biggest allocating functions:
		- _safe_get_dag_tree_view:/opt/airflow/airflow/providers/openlineage/utils/utils.py:446 -> 2.7GiB
		- get_tree_view:/opt/airflow/airflow/models/dag.py:2445 -> 2.7GiB
		- __setattr__:/opt/airflow/airflow/models/baseoperator.py:1191 -> 1.3MiB
		- __setattr__:/opt/airflow/airflow/models/baseoperator.py:1191 -> 1.3MiB
		- __setattr__:/opt/airflow/airflow/models/baseoperator.py:1191 -> 1.3MiB
===================== Warning summary. Total: 3, Unique: 3 =====================
airflow: total 1, unique 1
  collect: total 1, unique 1
other: total 2, unique 2
  collect: total 2, unique 2
Warnings saved into /opt/airflow/tests/warnings.txt file.
============================== 1 passed in 8.60s ===============================
```
https://github.com/apache/airflow/pull/41494
### What you think should happen instead?
I think the tree_view format should be changed to one that does not require
an extraordinary amount of whitespace in deeply nested cases.
It would be good to know where it is being used, though.
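
To illustrate why nesting depth matters, here is a minimal standalone sketch. It is not Airflow's implementation: `chain_graph`, `indented_tree_view` and `adjacency_view` are hypothetical helpers, and the sketch assumes a tree view that prefixes each task with four spaces per nesting level. It does not attempt to reproduce the 2.7GiB figure above, only the growth pattern.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]  # node id -> list of downstream node ids


def chain_graph(n: int) -> Graph:
    """Build a linear chain t0 -> t1 -> ... -> t(n-1)."""
    return {f"t{i}": ([f"t{i + 1}"] if i + 1 < n else []) for i in range(n)}


def indented_tree_view(graph: Graph, root: str, indent: int = 4) -> str:
    """Render one line per visited node, indented by its depth.

    Iterative rather than recursive so deep chains do not hit the
    recursion limit.
    """
    lines = []
    stack = [(root, 0)]
    while stack:
        node, level = stack.pop()
        lines.append(" " * (level * indent) + node)
        # Push children in reverse so they are popped in sorted order.
        for child in sorted(graph.get(node, []), reverse=True):
            stack.append((child, level + 1))
    return "\n".join(lines)


def adjacency_view(graph: Graph) -> str:
    """Render each node once, followed by its direct downstream ids."""
    lines = []
    for node in sorted(graph):
        children = ", ".join(sorted(graph[node]))
        lines.append(f"{node}: {children}" if children else f"{node}:")
    return "\n".join(lines)


graph = chain_graph(900)
tree = indented_tree_view(graph, "t0")
flat = adjacency_view(graph)
print(f"indented: {len(tree):,} chars, of which {tree.count(' '):,} are spaces")
print(f"adjacency: {len(flat):,} chars")
```

For a 900-task chain with four-space indentation, the indentation alone is 4 × (0 + 1 + … + 899) = 1,618,200 spaces, so it grows quadratically with depth, while the adjacency-style output grows linearly with the number of tasks and edges.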
### How to reproduce
Use the DAG above and call `get_tree_view` on it.
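
If memray is not available, a rough peak-memory check can be sketched with the standard library's `tracemalloc`. The `peak_memory` helper is hypothetical, and the string built below is only a stand-in workload; with Airflow installed and the DAG above in scope as `dag`, one could presumably pass `dag.get_tree_view` instead.

```python
import tracemalloc
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def peak_memory(fn: Callable[[], T]) -> Tuple[T, int]:
    """Run fn and return (result, peak bytes traced while it ran)."""
    tracemalloc.start()
    try:
        result = fn()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


# Stand-in workload: 2000 lines, each indented four spaces deeper than the last.
text, peak = peak_memory(
    lambda: "\n".join(" " * (4 * level) + "task" for level in range(2000))
)
print(f"{len(text):,} chars, peak {peak / 2**20:.1f} MiB")
```

`tracemalloc` only sees allocations made through Python's memory allocator, so its numbers will not match memray's exactly.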
### Operating System
Docker/Breeze on macOS
### Versions of Apache Airflow Providers
_No response_
### Deployment
Other
### Deployment details
_No response_
### Anything else?
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)