mobuchowski opened a new issue, #41505:
URL: https://github.com/apache/airflow/issues/41505

   ### Apache Airflow version
   
   2.10.0rc1
   
   ### If "Other Airflow 2 version" selected, which one?
   
   _No response_
   
   ### What happened?
   
   In degenerate cases, `get_tree_view` can use a lot of memory.
   
   For a DAG 
   
   ```
   from airflow import DAG
   from airflow.models.baseoperator import chain
   
   # LongEmptyOperator is not defined in this report; it is presumably a
   # helper from the test in the PR linked below.
   
   with DAG("aaa_big_get_tree_view", schedule=None) as dag:
       # A linear chain of 900 tasks, so the last one is nested 899 levels deep.
       first_set = [LongEmptyOperator(task_id=f"hello_{i}_{'a' * 230}") for i in range(900)]
       chain(*first_set)
   
       last_task_in_first_set = first_set[-1]
   
       # Five groups of 900 tasks, each directly downstream of the last chained task.
       chain(last_task_in_first_set, [LongEmptyOperator(task_id=f"world_{i}_{'a' * 230}") for i in range(900)])
       chain(last_task_in_first_set, [LongEmptyOperator(task_id=f"this_{i}_{'a' * 230}") for i in range(900)])
       chain(last_task_in_first_set, [LongEmptyOperator(task_id=f"is_{i}_{'a' * 230}") for i in range(900)])
       chain(last_task_in_first_set, [LongEmptyOperator(task_id=f"silly_{i}_{'a' * 230}") for i in range(900)])
       chain(last_task_in_first_set, [LongEmptyOperator(task_id=f"stuff_{i}_{'a' * 230}") for i in range(900)])
   ```
   
   serializing it can take 2.7 GiB, as the memray report below shows:
   
   ```
   root@a24bae3584cb:/opt/airflow# pytest --memray tests/providers/openlineage/utils/test_utils.py::test_get_dag_tree_large_dag
   
   ============================= test session starts ==============================
   platform linux -- Python 3.12.5, pytest-8.3.2, pluggy-1.5.0 -- /usr/local/bin/python
   cachedir: .pytest_cache
   rootdir: /opt/airflow
   configfile: pyproject.toml
   plugins: memray-1.7.0, timeouts-1.2.1, icdiff-0.9, mock-3.14.0, rerunfailures-14.0, requests-mock-1.12.1, xdist-3.6.1, asyncio-0.23.8, anyio-4.4.0, instafail-0.5.0, cov-5.0.0, time-machine-2.15.0, custom-exit-code-0.3.0
   asyncio: mode=Mode.STRICT
   setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s
   collected 1 item
   
   tests/providers/openlineage/utils/test_utils.py::test_get_dag_tree_large_dag PASSED [100%]
   
   ================================ MEMRAY REPORT =================================
   Allocation results for tests/providers/openlineage/utils/test_utils.py::test_get_dag_tree_large_dag at the high watermark
   
         📦 Total memory allocated: 5.4GiB
         📏 Total allocations: 23
         📊 Histogram of allocation sizes: |▁▁█  |
         🥇 Biggest allocating functions:
                - _safe_get_dag_tree_view:/opt/airflow/airflow/providers/openlineage/utils/utils.py:446 -> 2.7GiB
                - get_tree_view:/opt/airflow/airflow/models/dag.py:2445 -> 2.7GiB
                - __setattr__:/opt/airflow/airflow/models/baseoperator.py:1191 -> 1.3MiB
                - __setattr__:/opt/airflow/airflow/models/baseoperator.py:1191 -> 1.3MiB
                - __setattr__:/opt/airflow/airflow/models/baseoperator.py:1191 -> 1.3MiB
   
   ===================== Warning summary. Total: 3, Unique: 3 =====================
   airflow: total 1, unique 1
     collect: total 1, unique 1
   other: total 2, unique 2
     collect: total 2, unique 2
   Warnings saved into /opt/airflow/tests/warnings.txt file.
   ============================== 1 passed in 8.60s ===============================
   ```
   
   https://github.com/apache/airflow/pull/41494
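
As a rough illustration of why this DAG shape is costly to render as an indented tree, the sketch below counts only the leading spaces such a view would need. It assumes each task is indented by a fixed width per nesting level; the 4-space width and the helper name `indentation_chars` are assumptions made for this example, not taken from the Airflow source. It measures the final string only, not any intermediate copies made while building it, so it is not meant to reproduce the memray figure.

```python
def indentation_chars(depths, indent_width=4):
    """Total leading spaces if every task is indented by depth * indent_width."""
    return sum(depth * indent_width for depth in depths)


# Shape of the DAG above: a chain of 900 tasks at depths 0..899, followed by
# 5 groups of 900 tasks that all hang off the last chained task (depth 900).
chain_depths = list(range(900))
fan_out_depths = [900] * (5 * 900)

chain_spaces = indentation_chars(chain_depths)      # 1_618_200
fan_out_spaces = indentation_chars(fan_out_depths)  # 16_200_000
total_spaces = chain_spaces + fan_out_spaces        # 17_818_200

print(f"chain: {chain_spaces:,} spaces")
print(f"fan-out: {fan_out_spaces:,} spaces")
print(f"total: {total_spaces:,} spaces of pure indentation")
```

Under these assumptions the fan-out dominates: every one of the 4,500 leaf tasks carries 3,600 spaces of indentation, far more than its roughly 240-character task ID.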
   
   ### What you think should happen instead?
   
   I think the tree_view format should be changed to one that does not require an extraordinary amount of whitespace in deeply nested cases.
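
One possible shape for such a format, sketched here purely as an illustration, is a flat mapping from each task ID to its downstream task IDs, so nesting depth adds no characters. The code below models the DAG above with plain dicts (no Airflow imports) and compares the size of an indented rendering with a JSON adjacency mapping. All function names, the 4-space indent, and the choice of JSON are assumptions for this example, not a proposal for a specific Airflow API.

```python
import json


def make_task_ids(prefix, count, pad=230):
    """Task IDs shaped like the ones in the DAG above."""
    return [f"{prefix}_{i}_{'a' * pad}" for i in range(count)]


def build_downstream_map(chain_len=900, group_size=900):
    """Model the DAG as a plain dict: task_id -> list of downstream task_ids."""
    first_set = make_task_ids("hello", chain_len)
    downstream = {task_id: [] for task_id in first_set}
    for upstream, following in zip(first_set, first_set[1:]):
        downstream[upstream].append(following)
    last = first_set[-1]
    for prefix in ("world", "this", "is", "silly", "stuff"):
        for task_id in make_task_ids(prefix, group_size):
            downstream[last].append(task_id)
            downstream[task_id] = []
    return first_set[0], downstream


def indented_view(root, downstream, indent_width=4):
    """Indented tree rendering; iterative to stay clear of the recursion limit."""
    lines = []
    stack = [(root, 0)]
    while stack:
        task_id, level = stack.pop()
        lines.append(" " * (level * indent_width) + task_id)
        # Push children in reverse so they are popped in sorted order.
        for child in sorted(downstream[task_id], reverse=True):
            stack.append((child, level + 1))
    return "\n".join(lines)


def flat_view(downstream):
    """Adjacency mapping as JSON; its size does not depend on nesting depth."""
    return json.dumps(downstream, sort_keys=True)


root, downstream = build_downstream_map()
indented = indented_view(root, downstream)
flat = flat_view(downstream)
print(f"indented view: {len(indented):,} characters")
print(f"flat view:     {len(flat):,} characters")
```

The flat form repeats each task ID at most twice (once as a key, once as a child), so its size grows with the number of tasks and edges rather than with depth. The trade-off is that it is less readable by eye than an indented tree.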
   
   It would be good to know in which cases it is being used, though.
   
   ### How to reproduce
   
   Use the DAG above.
   
   ### Operating System
   
   Docker/Breeze on macOS
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   _No response_
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

