Final call for comments! I plan to start a vote thread before the end of the week.
TP

> On 14 May 2026, at 17:19, Tzu-ping Chung <[email protected]> wrote:
>
> Hi all,
>
> I have updated the AIP:
>
> * JavaCoordinator will be distributed with the Task SDK in airflow.sdk.coordinators
> * Coordinator configurations are specified as worker-side-only
> * A non-goal and a consideration section on Java code deployment
> * Consideration sections on experimental status, process model, and tooling integration (e.g. OL)
>
> I think I’ve addressed all previously mentioned points? Please tell me if I missed anything.
>
> TP
>
>
>> On 14 May 2026, at 04:51, André Ahlert <[email protected]> wrote:
>>
>> Agree with the direction. Embedding in the Task SDK avoids the matrix problem by construction.
>>
>> One open point worth pinning down before starting: I believe the AIP marks the interface experimental but does not define what graduates it out. Without that, "experimental" has no exit condition.
>>
>> On Wed, May 13, 2026 at 5:42 PM Jarek Potiuk <[email protected]> wrote:
>>> I think yes - as long as we avoid matrix testing and a complicated release schedule, it's good.
>>>
>>> J
>>>
>>>
>>> On Wed, May 13, 2026 at 9:04 PM Tzu-ping Chung <[email protected]> wrote:
>>>
>>> > OK, I think at this point we are both pretty determined on certain things here. Let’s see if there’s a compromise.
>>> >
>>> > The things I absolutely want are:
>>> >
>>> > 1. The coordinator to be defined under airflow.sdk.coordinators and be a public interface. I don’t want the possibility of needing to change the package name in the future.
>>> > 2. The configuration to use the package path as a public interface. Not mentioning the coordinator identifier would require each future coordinator to have its own configuration, or would require changing the configuration format. Both require much deprecation work.
>>> >
>>> > I can accept that the coordinator is released as a part of the Task SDK initially (under the aforementioned package name). Even if the split happens earlier, user education shouldn’t be TOO bad…? As long as the import path stays the same, this is a relatively simple fix in most deployments. The worst case is we only split in Airflow 4.
>>> >
>>> > Would this be acceptable from your perspective, Jarek?
>>> >
>>> > TP
>>> >
>>> >
>>> > On 14 May 2026, at 02:21, Jarek Potiuk <[email protected]> wrote:
>>> >
>>> > Well, in this case you have just one class, one coordinator, and even for other (non-Java) coordinators the "classpath" is the wrong thing to state. You could achieve the same by not stating the classpath - but simply stating which Java interpreters to use:
>>> >
>>> > [sdk]
>>> > jdk_bridge = {
>>> >     "jdk-11": {
>>> >         "kwargs": {"java_executable": "/usr/lib/jvm/java-11-openjdk/bin/java", "jars_root": ["/files/old/lib"]}
>>> >     },
>>> >     "jdk-17": {
>>> >         "kwargs": {"java_executable": "/usr/lib/jvm/java-17-openjdk/bin/java", "jars_root": ["/files/new/lib"], "jvm_args": ["-Xmx1024m"]}
>>> >     }
>>> > }
>>> >
>>> > > I really don’t understand the desire to have the Java coordinator inside the Task SDK distribution in the first go. The coordinator class must be public in the worker (at least the import path), and putting it in the SDK does not provide any more freedom to change it faster. It’s the contrary, because Task SDK releases require significantly more testing since the distribution contains many things, while providers (in a similar position to Airflow Core as coordinators are to the Task SDK) are released more frequently, and can have major version bumps on their own if needed.
>>> >
>>> > If we agree that you want to release bugfixes for the Task SDK independently and faster, then yes, a separate distribution might be a good reason. But you need to solve the SDK's version coupling issue to make it happen.
>>> >
>>> > This introduces operational complexities - depending on what kind of version coupling you choose between SDK and coordinators.
>>> >
>>> > What is the versioning and compatibility scheme you see? That will significantly impact testing complexity and the release schedule - because we will have to maintain a parallel release "train" for the "coordinator". For example, when a new SDK coordinator is released, it must work with existing SDKs - imagine we have SDK 1.2.*, 1.3.*, 1.4.*, 1.5.*. Will the new version of task-sdk be compatible? Should we add back-compat tests for all those versions? And I am not even talking about intentionally breaking the APIs, but unintentional bugs. Also, if someone uses the new version of the "SDK" but doesn't update the old version of the "Java Coordinator", will that continue to work? How do we ensure that? Are we going to test all SDK versions with all "coordinator" versions?
>>> >
>>> > This is the operational complexity I am talking about. We already have this for providers, and it only works because we intentionally limited back-compatibility, we run all those tests for older Airflow versions, and we have a years-stable and proven BaseHook and BaseOperator API that has not changed for years after it stabilized.
>>> > We could limit that operational complexity - for example by coupling minor versions. For example, we say SDK 1.2.* only works with coordinator 1.2.*, and SDK 1.3.* only works with coordinator 1.3.*, assuming only bugfixes are done in each.
>>> > That also means that coordinator changes from main will need to be cherry-picked to v3_N_test, and the faster releases of coordinators will have to be released from the v3_N_stable branch. That will limit back-compat tests, but it increases development complexity - because you will have to cherry-pick changes and have - potentially - independent releases of coordinator 1_N from that branch, where it will be tested with Airflow 3.N and SDK 3.N.
>>> >
>>> > So we have those trade-offs:
>>> >
>>> > 1) Strict coupling (pinning), SDK version == coordinator version: slower bugfix cycle, but no back-compat testing needed.
>>> > 2) Coupling SDK MAJOR.MINOR == coordinator MAJOR.MINOR: faster bugfix cycles, but increased development/release complexity, leading to cherry-picking to the v3 branch and separate coordinator releases from that branch; back-compatibility testing is limited to that v3_N_test branch.
>>> > 3) Free-for-all, any SDK works with any coordinator: faster bugfix cycles and simpler releases and development (releases done from main), but a hugely complex matrix of compatibility tests that might slow down testing even more.
>>> >
>>> > There is also a fourth option: what we do for providers, which is "limited free form". We deliberately limit "min_version" in providers and bump it regularly to reduce our compatibility matrix size.
>>> >
>>> > Those are basically the choices we have. I personally think option 1 is best at this stage. We release the task-sdk with Airflow every month. When needed, and if we find a critical bug, we can do an ad-hoc release.
>>> >
>>> > Which one would you prefer - and do you also want to commit to maintaining the associated development/testing complexity if it is not 1)?
>>> >
>>> > J.
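For illustration, the coupling options above boil down to a simple version check. This is only a sketch; the function names are hypothetical and not part of any Airflow API:

```python
def strictly_pinned(sdk_version: str, coordinator_version: str) -> bool:
    """Option 1: versions must match exactly, so no back-compat testing is needed."""
    return sdk_version == coordinator_version


def minor_coupled(sdk_version: str, coordinator_version: str) -> bool:
    """Option 2: compatible when MAJOR.MINOR match; the patch level may differ."""
    return sdk_version.split(".")[:2] == coordinator_version.split(".")[:2]
```

Option 3 would accept any pair, which is exactly why it needs the full compatibility test matrix.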
>>> >
>>> >
>>> > On Wed, May 13, 2026 at 7:07 PM Tzu-ping Chung via dev <[email protected]> wrote:
>>> >
>>> >> You can do the same if it’s in the Task SDK, but:
>>> >>
>>> >> 1. You need to use the same import path, but then you need to separately teach users to install a new package before moving it out. Not a very good user experience.
>>> >>
>>> >> 2. Or you use a different import path. You need to keep the old path working in the distribution for a long time *and* have users change their configs to fix the deprecation warning (and the eventual breaking change). Unnecessary mental gymnastics on both sides.
>>> >>
>>> >> I really don’t understand the desire to have the Java coordinator inside the Task SDK distribution in the first go. The coordinator class must be public in the worker (at least the import path), and putting it in the SDK does not provide any more freedom to change it faster. It’s entirely the contrary, since Task SDK releases require a lot more testing because the distribution contains many things, while providers (in a similar position to Airflow Core as coordinators are to the Task SDK) are released more frequently, and can have major version bumps on their own if needed.
>>> >>
>>> >>
>>> >> > On 14 May 2026, at 00:54, Jarek Potiuk <[email protected]> wrote:
>>> >> >
>>> >> > > You can create multiple instances of the same coordinator class. Pass appropriate arguments to suit your needs. This is in the AIP.
>>> >> >
>>> >> > Yes. And you can do exactly the same 1-1 if it's part of a package and embedded in the "airflow-sdk" distribution? Or am I wrong? Why do you think it would not be possible if it's part of task_sdk?
>>> >> >
>>> >> > On Wed, May 13, 2026 at 6:43 PM Tzu-ping Chung via dev <[email protected]> wrote:
>>> >> >> You can create multiple instances of the same coordinator class. Pass appropriate arguments to suit your needs. This is in the AIP.
>>> >> >>
>>> >> >> [sdk]
>>> >> >> coordinators = {
>>> >> >>     "jdk-11": {
>>> >> >>         "classpath": "airflow.sdk.coordinators.java.JavaCoordinator",
>>> >> >>         "kwargs": {"java_executable": "/usr/lib/jvm/java-11-openjdk/bin/java", "jars_root": ["/files/old/lib"]}
>>> >> >>     },
>>> >> >>     "jdk-17": {
>>> >> >>         "classpath": "airflow.sdk.coordinators.java.JavaCoordinator",
>>> >> >>         "kwargs": {"java_executable": "/usr/lib/jvm/java-17-openjdk/bin/java", "jars_root": ["/files/new/lib"], "jvm_args": ["-Xmx1024m"]}
>>> >> >>     }
>>> >> >> }
>>> >> >>
>>> >> >> The problem is, classpath points to a class, so whatever this string is needs to be kept compatible in future releases.
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> > On 13 May 2026, at 23:59, Jarek Potiuk <[email protected]> wrote:
>>> >> >> >
>>> >> >> > > How do you make it changeable any time? The user needs to be able to specify what coordinator to use in the config, and you can’t break that later.
>>> >> >> >
>>> >> >> > We have only a *jdk* coordinator now. So I will revert the question: how are you going to configure two "jdk" coordinators when you have separate distributions running "java"? Are you planning to install two "coordinator-jdk" packages? This isn't possible in Python unless you build almost the same package twice, with jdk-11 and jdk-19 built in.
>>> >> >> > My understanding is that you will have configuration options to choose between "jdk-11" and "jdk-19".
>>> >> >> > This "jdk" package of yours will simply have a list of "jdks" linking to the Java interpreters.
>>> >> >> >
>>> >> >> > So, it doesn't matter if it's a single "coordinator-jdk" package, or everything is in an "airflow.sdk._coordinator.jdk" package or an "airflow.sdk._bridge.jdk" package in the task-sdk. Regardless, you cannot install two "jdk" packages, whether they are separate distributions or the package is in "task-sdk", and you have to configure which of the "jdk" bridges you want to use.
>>> >> >> > Yes, sometime later, when we also have Go/TypeScript or other languages, we might decide to centralize some APIs, create a "true" coordinator package, and split into separate distributions. Splitting into different packages will be absolutely no problem then. Nothing will stop us from doing it.
>>> >> >> >
>>> >> >> > It will save a lot of time on the whole "distribution" issue - including releases, packaging, CI and everything connected - and it absolutely does not block us from a further split later.
>>> >> >> >
>>> >> >> > J.
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> > On Wed, May 13, 2026 at 3:47 PM Tzu-ping Chung via dev <[email protected]> wrote:
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> > On 13 May 2026, at 20:42, Jarek Potiuk <[email protected]> wrote:
>>> >> >> >> >
>>> >> >> >> > Not really. I proposed an internal package that can be changed **any time**. Users aren't supposed to use those items.
>>> >> >> >> > We can clearly mark them with "_" and also describe them thoroughly in the public API documentation. And no, initially providers were **not** in airflow at all - you started from step 2. Step 1 is that they were added at some point in time, long before my time. Hooks and operators as an "API" were created quite early in the concept of Airflow - and the first implementations were added then. Then, after common patterns emerged, those hooks and operators were grouped into providers (they were not initially) and only moved out after quite some time. As I see it, you even admit yourself that things will look different for different languages, and maybe we will not even need bridges for some of them at all. So why should we introduce a new concept if we already know it currently applies only to "JDK"? I fail to see why we should proceed if we already know the patterns are unlikely to be reusable in their current form.
>>> >> >> >>
>>> >> >> >> How do you make it changeable any time? The user needs to be able to specify what coordinator to use in the config, and you can’t break that later.
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> ---------------------------------------------------------------------
>>> >> >> >> To unsubscribe, e-mail: [email protected]
>>> >> >> >> For additional commands, e-mail: [email protected]
>>> >> >> >>
>>> >
>>> >
>>> >
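For reference, a `classpath`-style entry like the one in the config example earlier in this thread could be resolved with a small loader along these lines. This is only a sketch under assumed names (the function and config shape are illustrative, not the actual Task SDK implementation), but it shows why the dotted path effectively becomes a public interface:

```python
import importlib


def load_coordinator(name: str, coordinators: dict):
    """Resolve a config entry of the form {"classpath": "pkg.module.ClassName",
    "kwargs": {...}} into an instance.

    Because the dotted path lives in user configuration, renaming the module
    or class later breaks existing deployments - the compatibility concern
    raised in this thread.
    """
    entry = coordinators[name]
    module_path, _, class_name = entry["classpath"].rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls(**entry.get("kwargs", {}))
```

Any class importable by its dotted path can stand in for a coordinator here, which is also why two differently configured entries can share one `classpath`.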
