Hi Airflow developers,

I’ve been studying the Airflow *Task SDK* in detail, and I find its direction very interesting, especially the idea of introducing a stable, user-facing API layer that is decoupled from the internal executor, scheduler, and runtime behavior.
While going through the design notes and recent changes around the Task SDK, I was reminded of the architectural philosophy behind *Apache Spark Connect*, which also emphasizes:

- separating user-facing APIs from the underlying execution engine
- providing a stable, long-term public API surface
- enabling flexible execution models
- reducing coupling between API definitions and the actual runtime environment

This made me wonder whether the philosophical direction is similar or whether I am drawing an incorrect analogy. I would like to ask a few questions to better understand Airflow’s long-term intent:

------------------------------

*Q1.* Is the Task SDK intentionally aiming for a form of *API–engine decoupling* similar to Spark Connect, or is the motivation fundamentally different?

*Q2.* Is the long-term vision that tasks will be defined through a stable Task SDK interface while the underlying scheduler/executor implementations evolve independently, without breaking user code? (A sketch of my current understanding is in the P.S. below.)

*Q3.* From the perspective of the Airflow dev community, does it make sense to compare Task SDK ↔ Spark Connect (https://issues.apache.org/jira/browse/SPARK-39375), or is the architectural direction of Airflow fundamentally different?

------------------------------

I’m asking these questions because I want to *better understand the philosophy that Airflow is trying to pursue* and to confirm whether my interpretation of the Task SDK direction is accurate. Any insights or clarifications would be greatly appreciated.

Thank you for your continued work on Airflow.

Best regards,
*Kyungjun Lee*
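P.S. To make my mental model concrete, here is a minimal sketch of what I understand "authoring against the Task SDK" to look like. I am assuming the airflow.sdk namespace and the @dag/@task decorators from the Airflow 3 documentation; please correct me if the intended public surface is different. The point I am trying to illustrate is that the DAG file only imports from the SDK, never from scheduler or executor internals:

    from airflow.sdk import dag, task

    @dag(schedule=None)
    def example_pipeline():
        # Tasks are plain Python functions; the SDK decorators describe them,
        # while the scheduler/executor that actually runs them stays out of this file.
        @task
        def extract() -> dict:
            return {"records": 3}

        @task
        def load(payload: dict) -> None:
            print(f"loaded {payload['records']} records")

        # Dependencies are expressed through data flow between the decorated functions.
        load(extract())

    # Instantiating the decorated function registers the DAG for whatever
    # runtime ends up executing it.
    example_pipeline()

If this is not the intended authoring model, that correction alone would answer a good part of Q2 for me.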
