o-nikolas commented on PR #34381: URL: https://github.com/apache/airflow/pull/34381#issuecomment-1797017313
> I'd love to use Airflow's official ECS Executor instead of having to support my own, but the current implementation is not suitable for my use case (and likely others).

Hey, thanks for the feedback! We're working on adding more features to this executor, and we welcome PRs for any code changes you've made that are working well for you and your organization :)

> ### Supporting Capacity Providers
>
> The ECS Executor in this PR does not support Capacity Providers. It requires that users specify a [LaunchType](https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_RunTask.html#ECS-RunTask-request-launchType), and if you're specifying a LaunchType then you cannot specify a Capacity Provider.

This is a great request and should be a good first issue. I'll cut a GitHub Issue for it. If you have code for it, feel free to submit a PR.

> The executor config [is scoped to overrides.containerOverrides](https://github.com/Joffreybvn/airflow/commit/8518785ea1078552d1d2bffe4543b927f67f030d#diff-127473a028d510b65de4dd84962b31ab703fbd821f9e871dd296b5c835ee3eebR264-R282). However there are relevant properties outside of `overrides.containerOverrides` that users may want to change.
>
> For example, our ECS Cluster is actually composed of 3 capacity providers: A General-Purpose Capacity Provider (which is our cluster's default provider and runs on M7g instances), Memory-Optimized (R7g instances) and Compute-Optimized (C7g instances). My version of the ECS Executor allows users to set the appropriate Capacity Provider via the operator's `executor_config` param so that we can run our jobs in the most cost-efficient environment.

This is also a good request. We should make this change while the executor is in Experimental mode and we can still change that behaviour easily. I'll cut a GitHub Issue for it. If you have code for it, feel free to submit a PR.

> ### Adopting Task Instances
>
> This is a must-have for us, as our deployments replace our scheduler instances.
>
> There is a [PR](https://github.com/aelzeiny/airflow-aws-executors/pull/15) for that feature in @aelzeiny's executor. I had to make some changes to get that working properly. I can assist on a PR for this feature.

Would definitely appreciate a PR!

> ### Increasing Throughput
>
> The ECS Executor calls the ECS RunTask API sequentially. On our current environment, this leads to a maximum throughput of roughly 4 tasks launched per second per scheduler instance. This can cause issues for larger Airflow environments like my own, for example:
>
> * During peak times tasks often spend a long period in the scheduled state despite there being available capacity in the environment.
> * Larger values for `max_tis_per_query` can lead to missed heartbeats because of the length of time the Executor spends calling the RunTask API.
>
> I haven't had a chance to implement an improvement for this in my own executor yet, but my thinking was to incorporate the same `sync_parallelism` logic that is currently used for the `CeleryExecutor`.

Indeed, some performance tuning can be done with some of Airflow's knobs. I have personally gotten good results, scheduling 500 tasks in a few minutes, by increasing `max_tis_per_query` and relaxing the scheduler heartbeat a little, along with some other config changes. But getting to double-digit tasks scheduled per second for those very large-scale deployments will indeed likely need some code changes to the executor (again, PRs welcome :grinning:).
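
To make the throughput idea a bit more concrete, here is a rough sketch of submitting RunTask requests concurrently instead of one at a time. It is only an illustration under assumptions: `run_tasks_concurrently` and `max_workers` are hypothetical names that do not exist in this PR, and it swaps in a plain thread pool rather than reproducing the `sync_parallelism` logic of the `CeleryExecutor`. The `run_task` callable is passed in so the sketch does not depend on any particular client.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def run_tasks_concurrently(
    run_task: Callable[..., Dict[str, Any]],
    requests: List[Dict[str, Any]],
    max_workers: int = 8,  # hypothetical default, not a tuned value
) -> List[Dict[str, Any]]:
    """Call run_task(**kwargs) for each request using a thread pool.

    Responses are returned in the same order as the requests.
    """
    if not requests:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order; an exception raised by any call
        # is re-raised here when its result is consumed.
        return list(pool.map(lambda kwargs: run_task(**kwargs), requests))


# Possible usage with boto3 (assumed setup, shown as comments only):
#   import boto3
#   ecs = boto3.client("ecs")
#   responses = run_tasks_concurrently(
#       ecs.run_task,
#       [{"cluster": "my-cluster", "taskDefinition": "my-task"}],
#   )
```

A thread pool seems like a reasonable fit because each call mostly waits on the network. The worker count would need tuning, since ECS can throttle RunTask calls, and a real executor would also have to handle failed or throttled calls rather than letting the first exception propagate.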
