alex-astronomer opened a new issue, #23786:
URL: https://github.com/apache/airflow/issues/23786

   ### Apache Airflow version
   
   2.3.0 (latest released)
   
   ### What happened
   
   The web UI is slow to load the default (grid) view for DAGs when there are 
mapped tasks with a high number of expansions.
   
   I tested DAGs with varying numbers of mapped task expansions, and also 
varied the webserver resources, to see how each affects the load times.
   
   Here is a graph showing that testing.  Let me know if you have any other 
questions about this.
   
   <img width="719" alt="image" 
src="https://user-images.githubusercontent.com/89415310/169157983-6c04bf7a-5e49-4557-ae62-6905c2889e95.png">
   
   My findings based on what I'm seeing here:
   
   Going from 5 to 10 AUs makes a noticeable difference, but going from 10 to 
20 does not.  These diminishing returns from adding webserver resources lead me 
to believe that, once the webserver is scaled past a certain point, database 
performance becomes the limiting factor.
   
   If we look at the graph on a log scale, the 10 and 20 AU lines are almost 
perfectly linear.  This leads me to believe that the load time is a direct 
function of the number of task expansions that we have for a mapped task.
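
One way to check the scaling claim numerically is to fit a straight line to the timings in log-log space and look at the slope. The sketch below is only an illustration: it assumes both axes of the plot are logarithmic, the helper name `loglog_slope` is hypothetical, and the timings are synthetic placeholders rather than the measurements behind the graph.

```python
import math


def loglog_slope(expansions, seconds):
    """Least-squares slope of log(seconds) against log(expansions)."""
    xs = [math.log(n) for n in expansions]
    ys = [math.log(t) for t in seconds]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Expansion counts produced by the reproduction DAGs: 2**7 .. 2**12.
expansions = [2**scale for scale in range(7, 13)]

# SYNTHETIC placeholder timings (not real measurements), constructed to be
# exactly proportional to the number of expansions.
synthetic_seconds = [0.005 * n for n in expansions]

slope = loglog_slope(expansions, synthetic_seconds)
print(f"log-log slope: {slope:.2f}")
```

A slope close to 1 on real measurements would be consistent with load time growing in proportion to the number of expansions; a slope well above 1 would point to worse-than-linear growth.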
   
   ### What you think should happen instead
   
   The web UI should load in a reasonable amount of time.  Relative to the 
performance that we're getting now, anything under 10 seconds would be 
acceptable; ideally it would load in under 2-3 seconds, if possible.
   
   ### How to reproduce
   
   ```python
   from datetime import datetime
   
   from airflow.models import DAG
   from airflow.operators.empty import EmptyOperator
   from airflow.operators.python import PythonOperator
   
   default_args = {
       'owner': 'airflow',
       'depends_on_past': False,
       'email_on_failure': False,
       'email_on_retry': False,
   }
   
   # One DAG is generated per scale; its mapped task expands into
   # scaling_factor**scale task instances (128 to 4096 with these values).
   initial_scale = 7
   max_scale = 12
   scaling_factor = 2
   
   for scale in range(initial_scale, max_scale + 1):
       dag_id = f"dynamic_task_mapping_{scale}"
       with DAG(
           dag_id=dag_id,
           default_args=default_args,
           catchup=False,
           schedule_interval=None,
           start_date=datetime(1970, 1, 1),
           render_template_as_native_obj=True,
       ) as dag:
           start = EmptyOperator(task_id="start")
   
           # Each expansion receives a single positional argument and prints it.
           mapped = PythonOperator.partial(
               task_id="mapped",
               python_callable=lambda m: print(m),
           ).expand(
               op_args=[[x] for x in range(scaling_factor**scale)]
           )
   
           end = EmptyOperator(task_id="end")
   
           start >> mapped >> end
       # Register the DAG at module level so the scheduler discovers it.
       globals()[dag_id] = dag
   ```
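
For anyone who wants comparable numbers, the following is a minimal sketch of how a request against the webserver could be timed for each generated DAG. It is not the method used to produce the graph above. The helper names `median_seconds` and `fetch` are hypothetical, the URL to request is left to the caller, and authentication is omitted, so a real deployment would likely need a session cookie or token.

```python
import statistics
import time
import urllib.request


def median_seconds(action, repeats=5):
    """Call action() `repeats` times and return the median wall-clock duration."""
    durations = []
    for _ in range(repeats):
        started = time.perf_counter()
        action()
        durations.append(time.perf_counter() - started)
    return statistics.median(durations)


def fetch(url, timeout=120):
    """Download `url` completely and return the response size in bytes."""
    # Assumption: the URL is reachable without authentication.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return len(response.read())
```

Calling `median_seconds(lambda: fetch(grid_url))`, where `grid_url` is a placeholder for the grid view URL of one of the DAGs, gives the server response time only. Rendering time in the browser is not included, so what a user experiences may be slower.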
   
   ### Operating System
   
   Debian
   
   ### Versions of Apache Airflow Providers
   
   n/a
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
