shivaam commented on PR #64326:
URL: https://github.com/apache/airflow/pull/64326#issuecomment-4146318780

   Nice. Seems like a real production bug. A few thoughts:
   1. The default of 512 may be too low. The scheduler processes all active DAGs 
every cycle. With 1000+ DAGs, a 512-entry cache means constant eviction and 
re-fetching from the DB on every loop. The API server's Execution API also 
serves worker requests for every task state transition, so it can accumulate 
entries quickly too. Consider starting higher (2048+) and letting people tune 
down: it's easier to reduce a known number than to discover you need to 
increase one you didn't know existed.
   2. A single config for both scheduler and API server may not be ideal. The 
scheduler's working set is bounded (latest version per active DAG) and 
performance-sensitive, so it needs a cache big enough to hold all active DAGs. 
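
To make the eviction concern in point 1 concrete, here is a toy simulation. It is not Airflow's cache: it assumes plain LRU eviction and a loop that touches every active DAG once per cycle in the same order, and `LRUCache`, `simulate`, and the `dag_N` keys are hypothetical names used only for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache that counts hits and misses (illustrative only)."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        if key in self.data:
            # Mark the entry as most recently used.
            self.data.move_to_end(key)
            self.hits += 1
            return self.data[key]
        # Cache miss: stands in for a re-fetch from the DB.
        self.misses += 1
        value = load(key)
        self.data[key] = value
        if len(self.data) > self.maxsize:
            # Evict the least recently used entry.
            self.data.popitem(last=False)
        return value


def simulate(num_dags, cache_size, cycles):
    """Touch every DAG once per cycle; return (hits, misses)."""
    cache = LRUCache(cache_size)
    for _ in range(cycles):
        for dag_index in range(num_dags):
            cache.get(f"dag_{dag_index}", load=lambda k: f"serialized:{k}")
    return cache.hits, cache.misses


if __name__ == "__main__":
    for size in (512, 2048):
        hits, misses = simulate(num_dags=1000, cache_size=size, cycles=10)
        print(f"cache_size={size}: hits={hits} misses={misses}")
```

Under these assumptions, a cache smaller than the working set gets no hits at all on a cyclic scan, because each entry is evicted before it is read again. A cache at least as large as the working set misses only on the first cycle. Real access patterns are less regular, so the actual hit rate with 512 entries would likely fall somewhere in between.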

