xBis7 commented on PR #54103:
URL: https://github.com/apache/airflow/pull/54103#issuecomment-3281650048
> That gave me a testing idea. I'm going to load 100 dags with at least 500
> tasks each on the db and then re-capture the metrics to see how that affects
> the loop's performance.

I added 99 dags with just over 1000 tasks each; every dag had 1 more task than
the previous one, i.e. 1001, 1002, 1003, ..., 1099. The scheduler had roughly
100,000 tasks to go over from the db.
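
As a quick sanity check on the "roughly 100,000" figure, here is a minimal sketch of the arithmetic. It assumes the 99 extra dags had exactly 1001 through 1099 tasks, as the sequence above suggests; the variable names are only illustrative.

```python
# Assumption: 99 extra dags whose task counts run from 1001 to 1099 inclusive.
task_counts = list(range(1001, 1100))

dag_count = len(task_counts)
total_tasks = sum(task_counts)

print(f"{dag_count} dags, {total_tasks} tasks in total")
```

Under that assumption the total comes to a little under 104,000 tasks.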
I triggered the same 6 dags as before and the numbers were pretty much the
same.
* total number of scheduler iterations
  * with the patch
    * try 1: 517 iterations
    * try 2: 550 iterations
    * try 3: 542 iterations
  * original code
    * try 1: 1445 iterations
    * try 2: 1507 iterations
    * try 3: 1463 iterations
* total time
  * with the patch
    * try 1: 514.31 s
    * try 2: 498.51 s
    * try 3: 500.37 s
  * original code
    * try 1: 792.8 s
    * try 2: 811.82 s
    * try 3: 798.96 s
* average time per iteration
  * with the patch
    * try 1: 0.99 s
    * try 2: 0.9 s
    * try 3: 0.92 s
  * original code
    * try 1: 0.54 s
    * try 2: 0.538 s
    * try 3: 0.546 s
I ran the test 3 times and as you can see there wasn't much deviation in the
timings.
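
For anyone who wants to re-derive the per-iteration averages or quantify the deviation, the following is a small sketch that uses only the iteration counts and total times listed above. The `runs` layout and the `summarize` helper are hypothetical names for illustration, not part of the PR.

```python
from statistics import mean, pstdev

# Figures copied from the three tries listed above: (iterations, total seconds).
runs = {
    "with the patch": [(517, 514.31), (550, 498.51), (542, 500.37)],
    "original code": [(1445, 792.8), (1507, 811.82), (1463, 798.96)],
}


def summarize(samples):
    """Return mean iterations, mean total time, mean time per iteration,
    and the spread of the total time as a percentage of its mean."""
    iterations = [n for n, _ in samples]
    totals = [t for _, t in samples]
    per_iteration = [t / n for n, t in samples]
    return {
        "mean_iterations": mean(iterations),
        "mean_total_s": mean(totals),
        "mean_per_iteration_s": mean(per_iteration),
        "total_spread_pct": 100 * pstdev(totals) / mean(totals),
    }


summary = {label: summarize(samples) for label, samples in runs.items()}
for label, stats in summary.items():
    print(label, {key: round(value, 3) for key, value in stats.items()})
```

With these inputs the mean total time is about 504 s with the patch versus about 801 s for the original code (roughly 37% less), while each iteration takes longer on average (about 0.94 s versus about 0.54 s), and the total time varies by less than 2% across tries in both cases.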
<img width="2499" height="1084" alt="perf2"
src="https://github.com/user-attachments/assets/1cca83a5-39a9-4683-a606-484142f4ab0c"
/>