kaxil commented on code in PR #63500:
URL: https://github.com/apache/airflow/pull/63500#discussion_r2934450546
##########
task-sdk/docs/deferred-vs-async-operators.rst:
##########
@@ -0,0 +1,118 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements. See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership. The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License. You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied. See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+.. _sdk-deferred-vs-async-operators:
+
+Deferred vs Async Operators
+===========================
+
+ .. versionadded:: 3.2.0
+
+Airflow contains Python native async support, enabling task authors to leverage asynchronous I/O for high-throughput workloads.
+It is important to understand how this differs from deferred operators.
+
+Deferred Operators
+------------------
+
+A deferred operator is an operator that can pause its execution until an external trigger event occurs,
+without holding a worker slot. For more details see :doc:`authoring-and-scheduling/deferring`.
+Examples include the HttpOperator in deferrable mode, sensors or operators integrated with triggers.
+
+Key characteristics:
+
+ - Execution is paused while waiting for external events or resources.
+ - Worker slots are freed during the wait, improving resource efficiency.
+ - Ideal for scenarios where a single external event or a small number of events dictate task completion.
+
+Typically simpler to use, as no custom async logic is required as this is all handled by the deferred operator.
+
+Async Python Operators
+----------------------
+
+Python native async operators allow you to write tasks that leverage Python's asyncio:
+
+ - Tasks can perform many concurrent I/O operations efficiently within a single worker slot sharing the same event loop.
+ - Task code uses async/await syntax with async-compatible hooks, such as HttpAsyncHook or the SFTPHookAsync.

Review Comment:
Two things:

1. **Missing `BaseAsyncOperator`**: This section only covers `@task async def` (taskflow API). Users writing class-based operators should know about `BaseAsyncOperator` (defined in `airflow.sdk`). At minimum a mention like: "For class-based operators, subclass `BaseAsyncOperator`."

2. **"sharing the same event loop"** (line 48): Each task instance gets its own event loop (via `asyncio.run()`). The concurrency is *within* that single task's loop, not shared across task instances. "sharing the same event loop" could mislead users into thinking multiple tasks share one loop. Consider: "using a single event loop within the task."
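
To illustrate point 2, here is a minimal sketch that uses only the standard library and no Airflow APIs. It assumes, as described above, that each task instance is driven by its own `asyncio.run()` call. The names `fetch` and `run_task` are hypothetical stand-ins for an async hook call and a task body, not part of any Airflow interface.

```python
import asyncio


async def fetch(item: int) -> tuple[int, asyncio.AbstractEventLoop]:
    # Stand-in for an awaitable I/O call, such as an async HTTP request.
    await asyncio.sleep(0.01)
    # Report which loop actually ran this coroutine.
    return item * 2, asyncio.get_running_loop()


async def run_task(items: list[int]) -> tuple[list[int], bool, asyncio.AbstractEventLoop]:
    # Hypothetical task body: every coroutine below runs on this task's loop.
    task_loop = asyncio.get_running_loop()
    pairs = await asyncio.gather(*(fetch(i) for i in items))
    results = [value for value, _ in pairs]
    same_loop = all(loop is task_loop for _, loop in pairs)
    return results, same_loop, task_loop


# Each asyncio.run() call creates a new event loop and closes it on exit,
# so two "task instances" never share a loop.
first_results, first_same, first_loop = asyncio.run(run_task([1, 2, 3]))
second_results, second_same, second_loop = asyncio.run(run_task([4, 5]))
```

Within one call, all `fetch` coroutines observe the same running loop. Across calls, the loops are distinct objects, and the first is already closed by the time the second starts. That is the distinction the suggested wording "using a single event loop within the task" is meant to capture.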
