Hi Tzu-ping,

This looks great. I suggest formalizing this with a more detailed
design—ideally an AIP accompanied by architecture and user documentation.
Following the "docs-first" discussions, these could be reverse-engineered
from your POC code and included in the initial "founding" PR.

Using agentic workflows to automate the "code + design + docs" sync would
be ideal here. This approach allows diverse stakeholders to review the
feature from different perspectives early on:

- Architectural: Reviewing system interactions via design docs.
- Implementation: Standard code reviews.
- User Experience: Testing teams can iterate on user docs to develop test
plans.

Given the broad architectural impact and the new user surface for Java
developers, this proposal is a perfect candidate for this automated,
documentation-led approach. It would also allow others to contribute
documentation directly to the PR as the implementation evolves.

Best,
Jarek

On Mon, Apr 27, 2026 at 8:09 AM Tzu-ping Chung via dev <
[email protected]> wrote:

> Hi all,
>
> As mentioned in the latest dev call, we have been developing a Java SDK
> with changes to Airflow in a separate fork[1]. We plan to start merging the
> Java SDK work back into the OSS repository.
>
> We see this as a natural step following initial work in AIP-72[2], which
> created “a clean language agnostic interface for task execution, with
> support for multiple language bindings” (quoted from the proposal).
>
> The Java SDK also uses Ash’s addition of @task.stub[3] for the Go SDK, to
> declare a task in a DAG to be “implemented elsewhere” (not in the annotated
> function). Similar to the Go SDK, we also created a Java library that users
> can use to write task implementations for Airflow to execute at runtime.
>
> [1]: https://github.com/astronomer/airflow/tree/feature/java-all
> [2]: https://cwiki.apache.org/confluence/x/xgmTEg
> [3]: https://github.com/apache/airflow/pull/56055
>
> The user-facing syntax for a stub task would be the same as implemented by
> the Go SDK:
>
>     @task.stub(queue="java-tasks")
>     def my_task(): ...
>
> With a new configuration option to map tasks in a pool to be executed by a
> specific SDK:
>
>     [sdk]
>     queue_to_sdk = {"java-tasks": "java"}
>
> The configuration is needed for some executors the Go SDK currently does
> not support. The Go SDK currently relies on each executor worker process to
> specify which queues they listen to, but this is not always viable, since
> some executors—LocalExecutor, for example—do not have the concept of worker
> processes.
>
> The Coordinator Layer
> =====================
>
> When the Go SDK was implemented, it left out Runtime Airflow plugins as a
> future topic. This includes custom XCom backends, secrets backends lookup
> for connections and variables, etc. These components are implemented in
> Python, and a Java task cannot easily use the feature unless we also
> implement the lookup logic in Java. We don’t want to do that since it
> introduces significant overhead to writing plugins, and the overhead
> multiplies with each new language SDK.
>
> Fortunately, the current execution-time task runner already uses a
> two-layer design. When an executor wants to run a task, it starts a
> (Python) task runner process that talks to Airflow Core through the
> Execution API, and *forks* another (Python) process, which talks to the
> task runner through TCP, to run the actual task code. Airflow plugins
> simply go into the task runner process.
>
> This design works well for us since it keeps all the Airflow plugins in
> Python. The only thing missing is an abstraction for the task runner
> process to run tasks in any language. We are calling this new layer the
> **Coordinator**.
>
> When a DAG bundle is loaded, it not only tells Airflow how to find the
> DAGs (and the tasks in them), but also how to *run* each task. Current
> Python tasks use the Python Coordinator, running tasks by forking as
> previously described. A new JVM Coordinator will instruct the task runner
> how to run tasks packaged in JAR files.
>
> Each coordinator implements a base interface (BaseRuntimeCoordinator) that
> handles three concerns:
>
> - Discovery: determining whether a given file belongs to this coordinator
> (e.g. JAR files for Java).
> - DAG parsing: returning a runtime-specific subprocess command to parse
> DAG files in the target language.
> - Task execution: returning a runtime-specific subprocess command to
> execute tasks in the target runtime.
>
> The base class owns the full bridge lifecycle—TCP servers, subprocess
> management, and cleanup—so language providers only need to implement these
> three methods.
>
> The coordinator translates a DagFileParseRequest (for DAG parsing) and
> StartupDetails (for Task execution) data model (as declared in Airflow)
> into the appropriate commands for the target runtime. For example, a “java
> -classpath ... /path/to/MainClass ...” subprocess command that points to
> the correct JAR file and main class in this case.
>
> Coordinators as Airflow Providers
> =================================
>
> The base coordinator interface and the Python coordinator will live in
> “airflow.sdk.execution_time”. Other coordinators (for foreign languages)
> are registered through the existing Airflow provider mechanism. Each SDK
> provider declares its coordinator in its provider.yaml under a
> “coordinators” extension point. Both ProvidersManager (airflow-core) and
> ProvidersManagerTaskRuntime (task-sdk) discover coordinators through this
> extension point. This means adding a new language runtime requires only a
> provider package. No changes to Airflow Core are needed.
>
> The new JVM-based coordinator will live in the namespace
> “airflow.providers.sdk.java”. This is not the most accurate name
> (technically it should be “jvm” instead), but in practice most users will
> recognize it, and (from my understanding) other JVM language users (e.g.
> Kotlin, Scala) are already well-versed enough dealing with Java
> interoperability to understand “java” means JVM in this context.
>
> Writing DAGs in Java
> ====================
>
> This is not strictly connected to AIP-72, but considered by us as a
> natural next step since we can now implement tasks in a foreign language.
> Being able to define the DAG in the same language as the task
> implementation is useful since writing Python, even if only with minimal
> syntax, is still a hurdle for those not already familiar with, or even
> allowed to run it. There are mainly three things we need on top of the task
> implementation interface:
>
> - DAG flags (e.g. schedule, max_active_tasks)
> - Task flags (e.g. trigger_rule, weight_rule)
> - Task dependencies
>
> A proof-of-concept implementation is included with other changes proposed
> elsewhere in this document.
>
> Lazy Consensus Topics
> =====================
>
> We’re calling for lazy consensus for the following topics
>
> - A new “queue_to_sdk” configuration option to route tasks to a specific
> language SDK
> - A new coordinator layer in the SDK to route implementations at execution
> time.
> - New providers under airflow.providers.sdk to provide additional language
> support.
> - Develop the Go SDK to support the proposed model and a provider package
> for the coordinator. (Existing features stay as-is; no breaking changes.)
> - Add the new Java SDK and the corresponding provider package.
>
> TP
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>

Reply via email to