o-nikolas commented on PR #34381:
URL: https://github.com/apache/airflow/pull/34381#issuecomment-1797017313

   
   > I'd love to use Airflow's official ECS Executor instead of having to 
support my own, but the current implementation is not suitable for my use case 
(and likely others).
   
   Hey, thanks for the feedback! We're working on adding more features to this 
executor, and we welcome PRs for any changes you've made that are working well 
for you and your organization :)
   
   > ### Supporting Capacity Providers
   > 
   > The ECS Executor in this PR does not support Capacity Providers. It 
requires that users specify a 
[LaunchType](https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_RunTask.html#ECS-RunTask-request-launchType),
 and if you're specifying a LaunchType then you cannot specify a Capacity 
Provider.
   
   This is a great request and should make a good first issue. I'll cut a 
GitHub issue for it. If you have code for it, feel free to submit a PR.
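
For anyone picking this up, here is a rough sketch of how the RunTask arguments could be assembled so that `launchType` and `capacityProviderStrategy` are never sent together (the RunTask API requires `launchType` to be omitted when a capacity provider strategy is given). The helper name `build_run_task_kwargs` and its parameters are hypothetical and are not part of the executor in this PR; only the RunTask field names come from the ECS API.

```python
from __future__ import annotations

from typing import Any


def build_run_task_kwargs(
    cluster: str,
    task_definition: str,
    launch_type: str | None = None,
    capacity_provider: str | None = None,
) -> dict[str, Any]:
    """Hypothetical helper: build keyword arguments for the ECS RunTask call."""
    if launch_type and capacity_provider:
        # RunTask rejects requests that set both fields.
        raise ValueError("Specify either launch_type or capacity_provider, not both")

    kwargs: dict[str, Any] = {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "count": 1,
    }
    if capacity_provider:
        kwargs["capacityProviderStrategy"] = [
            {"capacityProvider": capacity_provider, "weight": 1}
        ]
    elif launch_type:
        kwargs["launchType"] = launch_type
    # With neither set, ECS falls back to the cluster's default
    # capacity provider strategy.
    return kwargs
```

The resulting dict would then be passed to the boto3 ECS client as `client.run_task(**kwargs)`.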
   
   
   > The executor config [is scoped to 
overrides.containerOverrides](https://github.com/Joffreybvn/airflow/commit/8518785ea1078552d1d2bffe4543b927f67f030d#diff-127473a028d510b65de4dd84962b31ab703fbd821f9e871dd296b5c835ee3eebR264-R282).
 However there are relevant properties outside of 
`overrides.containerOverrides` that users may want to change.
   > 
   > For example, our ECS Cluster is actually composed of 3 capacity providers: 
A General-Purpose Capacity Provider (which is our cluster's default provider 
and runs on M7g instances), Memory-Optimized (R7g instances) and 
Compute-Optimized (C7g instances). My version of the ECS Executor allows users 
to set the appropriate Capacity Provider via the operator's `executor_config` 
param so that we can run our jobs in the most cost-efficient environment.
   
   This is also a good request. We should make the change while the executor is 
still in Experimental mode and that behaviour is easy to change. I'll cut a 
GitHub issue for it. If you have code for it, feel free to submit a PR.
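
One possible shape for this, as a minimal sketch: treat `executor_config` as a partial set of RunTask arguments and deep-merge it over the executor's defaults, instead of scoping it to `overrides.containerOverrides`. The function name `merge_run_task_kwargs` and the example values are assumptions for illustration, not the executor's current behaviour.

```python
from __future__ import annotations

import copy
from typing import Any


def merge_run_task_kwargs(
    base: dict[str, Any], executor_config: dict[str, Any]
) -> dict[str, Any]:
    """Hypothetical helper: deep-merge a task's executor_config over defaults."""
    merged = copy.deepcopy(base)
    for key, value in executor_config.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so nested keys such as "overrides" are merged, not replaced.
            merged[key] = merge_run_task_kwargs(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    # launchType must be omitted when a capacity provider strategy is set.
    if "capacityProviderStrategy" in merged:
        merged.pop("launchType", None)
    return merged


# Example: the default uses a launch type; one task opts into a
# (hypothetically named) memory-optimized capacity provider.
defaults = {
    "cluster": "my-cluster",
    "launchType": "EC2",
    "overrides": {"containerOverrides": [{"name": "airflow"}]},
}
task_config = {
    "capacityProviderStrategy": [
        {"capacityProvider": "memory-optimized", "weight": 1}
    ]
}
run_task_kwargs = merge_run_task_kwargs(defaults, task_config)
```

Lists (including `containerOverrides`) are replaced wholesale in this sketch; a real implementation would need to decide how those should merge.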
   
   > ### Adopting Task Instances
   > 
   > This is a must-have for us, as our deployments replace our scheduler 
instances.
   > 
   > There is a [PR](https://github.com/aelzeiny/airflow-aws-executors/pull/15) 
for that feature in @aelzeiny's executor. I had to make some changes to get 
that working properly. I can assist on a PR for this feature.
   
   Would definitely appreciate a PR!
   
   > ### Increasing Throughput
   > 
   > The ECS Executor calls the ECS RunTask API sequentially. On our current 
environment, this leads to a maximum throughput of roughly 4 tasks launched per 
second per scheduler instance. This can cause issues for larger airflow 
environments like my own, for example:
   > 
   >     * During peak times tasks often spend a long period in the scheduled 
state despite there being available capacity in the environment.
   > 
   >     * Larger values for `max_tis_per_query` can lead to missed heartbeats 
from the length of time the Executor is spent calling the RunTask API.
   > 
   > 
   > I haven't had a chance to implement an improvement for this in my own 
executor yet, but my thinking was to incorporate the same `sync_parallelism` 
logic that is currently used for the `CeleryExecutor`.
   
   Indeed, some performance tuning can be done with Airflow's existing knobs. 
I have personally gotten good results, scheduling 500 tasks in a few minutes, 
by increasing `max_tis_per_query` and relaxing the scheduler heartbeat a 
little, along with some other configs. But reaching double-digit tasks 
scheduled per second for very large deployments will likely need code 
changes to the executor (again, PRs welcome :grinning:).
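
As a starting point for such a change, here is a minimal sketch of issuing RunTask calls concurrently rather than sequentially. It uses a thread pool, which is a different mechanism from the `CeleryExecutor`'s `sync_parallelism` but follows the same idea of bounded parallel submission. The function `launch_tasks_in_parallel`, its parameters, and the `run_task` callable are all hypothetical; in practice `run_task` would wrap the boto3 ECS client's `run_task` method.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def launch_tasks_in_parallel(
    run_task: Callable[..., Any],
    pending: dict[str, dict[str, Any]],
    max_workers: int = 8,
) -> dict[str, Any]:
    """Hypothetical helper: call run_task once per pending task, concurrently.

    Returns a mapping of task key to either the call's response or the
    exception it raised, so one failed launch does not hide the others.
    """
    results: dict[str, Any] = {}
    if not pending:
        return results

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            key: pool.submit(run_task, **kwargs) for key, kwargs in pending.items()
        }
    # Leaving the "with" block waits for every submitted call to finish.
    for key, future in futures.items():
        try:
            results[key] = future.result()
        except Exception as exc:  # collect the failure for retry handling
            results[key] = exc
    return results
```

Capping `max_workers` keeps the request rate bounded, which matters because the ECS API throttles callers; the right value would need to be measured per environment.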

