potiuk commented on PR #25101: URL: https://github.com/apache/airflow/pull/25101#issuecomment-1186580863
Just to answer the question: well, why not, actually? I am not sure whether I am right or wrong, but I could argue it this way. You have queues (and a default queue size) that already define the "resource" usage. What's more, you can mark heavier tasks as taking more slots in the queue (for example, a task that uses 4 CPUs can take 4 slots). Queues are really the way to define the "resource" binding of each task, also because you can have different queues, and each queue can be bound to different resources or even a different executor (CeleryKubernetes).

Parallelism is different. It tells the scheduler to stop scheduling (and thus managing) new tasks if more than X tasks are already running under the control of the executor used by that scheduler. So what it accounts for is the extra effort the scheduler needs to manage and control more running tasks, not how many resources those tasks take. And it makes sense to make it per-scheduler, as it is bound to a "scheduler resource" rather than a "worker resource".

Maybe the names could be different, but I believe that was the original intention behind implementing it this way.
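
To make the distinction concrete, here is a toy model of the two limits. It is only a sketch and not Airflow's scheduler code: `Task`, `schedule`, `slot_capacity` and the greedy admission loop are hypothetical names invented for illustration. The slot limit counts how much weight the running tasks carry, while the parallelism limit counts only how many tasks are running, whatever their weight.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    slots: int = 1  # hypothetical resource weight, e.g. 4 for a 4-CPU task


def schedule(tasks, slot_capacity, parallelism):
    """Greedily admit tasks in order while BOTH limits allow it.

    slot_capacity: total slot weight that may run at once (resource bound).
    parallelism:   number of tasks that may run at once (scheduler bound).
    Returns a (running, waiting) pair of task-name lists.
    """
    running, waiting = [], []
    used_slots = 0
    for task in tasks:
        fits_slots = used_slots + task.slots <= slot_capacity
        under_cap = len(running) < parallelism
        if fits_slots and under_cap:
            running.append(task.name)
            used_slots += task.slots
        else:
            waiting.append(task.name)
    return running, waiting


tasks = [Task("heavy", slots=4), Task("a"), Task("b"), Task("c"), Task("d")]

# Slot limit binds: "heavy" alone consumes 4 of the 6 slots.
by_slots = schedule(tasks, slot_capacity=6, parallelism=10)

# Parallelism binds: only two tasks run, no matter how light they are.
by_count = schedule(tasks, slot_capacity=100, parallelism=2)

print(by_slots)
print(by_count)
```

In the first call the heavy task crowds out others because of its weight; in the second, weight is irrelevant and only the number of tasks the scheduler has to track matters.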
