Hi all, I have updated the AIP:
* JavaCoordinator will be distributed with the Task SDK in airflow.sdk.coordinators
* Coordinator configurations are specified as worker-side-only
* Non-goal and a consideration section on Java code deployment
* Consideration sections on experimental status, the process model, and tooling integration (e.g. OpenLineage)

I think I’ve addressed all previously mentioned points? Please tell me if I missed anything.

TP

> On 14 May 2026, at 04:51, André Ahlert <[email protected]> wrote:
>
> Agree with the direction. Embedding in the Task SDK avoids the matrix problem by
> construction.
>
> One open point worth pinning down before starting: the AIP
> marks the interface experimental but does not define what graduates it out.
> Without that, "experimental" has no exit condition.
>
> On Wed, May 13, 2026 at 17:42, Jarek Potiuk <[email protected]> wrote:
>> I think yes - as long as we avoid matrix testing and a complicated release
>> schedule, it's good.
>>
>> J
>>
>> On Wed, May 13, 2026 at 9:04 PM Tzu-ping Chung <[email protected]> wrote:
>>
>> > OK, I think at this point we are both pretty determined on certain things
>> > here. Let’s see if there’s a compromise.
>> >
>> > The things I absolutely want are:
>> >
>> > 1. The coordinator to be defined under airflow.sdk.coordinators as a
>> > public interface. I don’t want the possibility of needing to change the
>> > package name in the future.
>> > 2. The configuration to use the package path as a public interface. Not
>> > mentioning the coordinator identifier would require each future
>> > coordinator to have its own configuration, or the configuration format to
>> > change. Both require much deprecation work.
>> >
>> > I can accept the coordinator being released as part of the Task SDK
>> > initially (under the aforementioned package name). Even if it happens
>> > earlier, user education shouldn’t be TOO bad…?
>> > As long as the import path stays the same, this is a relatively simple
>> > fix in most deployments. The worst case is we only split in Airflow 4.
>> >
>> > Would this be acceptable from your perspective, Jarek?
>> >
>> > TP
>> >
>> > On 14 May 2026, at 02:21, Jarek Potiuk <[email protected]> wrote:
>> >
>> > Well, in this case you have just one class, one coordinator, and even for
>> > other (non-Java) coordinators "classpath" is the wrong thing to say. You
>> > could achieve the same by not stating the classpath - but simply stating
>> > which Java interpreters to use.
>> >
>> > [sdk]
>> > jdk_bridge = {
>> >     "jdk-11": {
>> >         "kwargs": {"java_executable": "/usr/lib/jvm/java-11-openjdk/bin/java", "jars_root": ["/files/old/lib"]}
>> >     },
>> >     "jdk-17": {
>> >         "kwargs": {"java_executable": "/usr/lib/jvm/java-17-openjdk/bin/java", "jars_root": ["/files/new/lib"], "jvm_args": ["-Xmx1024m"]}
>> >     }
>> > }
>> >
>> > > I really don’t understand the desire to have the Java coordinator inside
>> > > the Task SDK distribution in the first go. The coordinator class must be
>> > > public in the worker (at least the import path), and putting it in the
>> > > SDK does not provide any more freedom to change it faster. It’s the
>> > > contrary, because Task SDK releases require significantly more testing
>> > > since the distribution contains many things, while providers (which are
>> > > to Airflow Core what coordinators are to the Task SDK) are released more
>> > > frequently, and can have major version bumps on their own if needed.
>> >
>> > If we agree that you want to release bugfixes for the Task SDK
>> > independently and faster, then yes, a separate distribution might be a
>> > good reason. But you need to solve the SDK's version coupling issue to
>> > make it happen.
>> >
>> > This introduces operational complexities - depending on what kind of
>> > version coupling you choose between the SDK and coordinators.
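For context on why the configured string is load-bearing in this whole discussion: the worker resolves it into a class at runtime, roughly like the sketch below. All names here are illustrative, not the actual Task SDK API; the point is only that renaming the module or class silently breaks every existing config.

```python
# Illustrative sketch (not real Airflow code): resolving a configured
# dotted path such as "airflow.sdk.coordinators.java.JavaCoordinator"
# into a class and instantiating it with the configured kwargs.
import importlib


def load_coordinator(classpath: str, **kwargs):
    """Resolve a dotted "module.ClassName" string and instantiate it."""
    module_name, _, class_name = classpath.rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**kwargs)
```

Because the string is looked up dynamically, any rename requires either a config migration or a long-lived import-path alias, which is the deprecation work mentioned above.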
>> >
>> > What is the versioning and compatibility scheme you see? That will
>> > significantly impact testing complexity and the release schedule -
>> > because we will have to maintain a parallel release "train" for the
>> > "coordinator". For example, when a new coordinator is released, it must
>> > work with existing SDKs - imagine we have SDK 1.2.*, 1.3.*, 1.4.*, 1.5.*.
>> > Will the new version be compatible with each of them? Should we add
>> > back-compat tests for all those versions? And I am not even talking about
>> > intentionally breaking the APIs, but unintentional bugs. Also, what if
>> > someone uses a new version of the "SDK" but doesn't update an old version
>> > of the "Java Coordinator"? Will that continue to work? How do we ensure
>> > that? Are we going to test all SDK versions with all "coordinator"
>> > versions?
>> >
>> > This is the operational complexity I am talking about. We already have
>> > this for providers, and it only works because we intentionally limited
>> > back-compatibility, we run all those tests for older Airflow versions,
>> > and we have a proven BaseHook and BaseOperator API that has not changed
>> > in the years since it stabilized.
>> >
>> > We could limit that operational complexity - for example by coupling
>> > minor versions: say SDK 1.2.* only works with coordinator 1.2.*, and SDK
>> > 1.3.* only works with coordinator 1.3.*, assuming only bugfixes are done
>> > in each. That also means that coordinator changes from main will need to
>> > be cherry-picked to v3_N_test, and the faster coordinator releases will
>> > have to be cut from the v3_N_stable branch. That limits back-compat
>> > tests, but increases development complexity - because you will have to
>> > cherry-pick changes and have - potentially - independent releases of
>> > coordinator 1_N from that branch, where it will be tested with Airflow
>> > 3.N and SDK 3.N.
>> >
>> > So we have these trade-offs:
>> >
>> > 1) Strict coupling (pinning), SDK version == coordinator version: slower
>> > bugfix cycle, but no back-compat testing needed.
>> > 2) Coupling SDK MAJOR.MINOR == coordinator MAJOR.MINOR: faster bugfix
>> > cycles, but increased development/release complexity - cherry-picking to
>> > the v3 branch and separate coordinator releases from that branch;
>> > back-compatibility testing is limited to the v3_N_test branch.
>> > 3) Free-fall, any SDK works with any coordinator: faster bugfix cycles,
>> > simpler releases and development (releases done from main), but a hugely
>> > complex matrix of compatibility tests that might slow down testing even
>> > more.
>> >
>> > There is also a fourth option: what we do for providers, which is
>> > "limited free-fall." We deliberately limit "min_version" in providers and
>> > bump it regularly to reduce our compatibility matrix size.
>> >
>> > Those are basically the choices we have. I personally think option 1 is
>> > best at this stage. We release the task-sdk with Airflow every month.
>> > When needed, and if we find a critical bug, we can do an ad-hoc release.
>> >
>> > Which one would you prefer - and do you also want to commit to
>> > maintaining the associated development/testing complexity if it is not
>> > 1)?
>> >
>> > J.
>> >
>> > On Wed, May 13, 2026 at 7:07 PM Tzu-ping Chung via dev <[email protected]> wrote:
>> >
>> >> You can do the same if it’s in the Task SDK, but:
>> >>
>> >> 1. You need to use the same import path, but then you need to separately
>> >> teach users to install a new package before moving it out. Not a very
>> >> good user experience.
>> >>
>> >> 2. Or you use a different import path. You need to keep the old path
>> >> working in the distribution for a long time *and* have users change
>> >> their configs to fix the deprecation warning (and eventual breaking
>> >> change).
>> >> Unnecessary mental gymnastics on both sides.
>> >>
>> >> I really don’t understand the desire to have the Java coordinator inside
>> >> the Task SDK distribution in the first go. The coordinator class must be
>> >> public in the worker (at least the import path), and putting it in the
>> >> SDK does not provide any more freedom to change it faster. It’s entirely
>> >> the contrary, since Task SDK releases require a lot more testing because
>> >> the distribution contains many things, while providers (which are to
>> >> Airflow Core what coordinators are to the Task SDK) are released more
>> >> frequently, and can have major version bumps on their own if needed.
>> >>
>> >> > On 14 May 2026, at 00:54, Jarek Potiuk <[email protected]> wrote:
>> >> >
>> >> > > You can create multiple instances of the same coordinator class.
>> >> > > Pass appropriate arguments to suit your needs. This is in the AIP.
>> >> >
>> >> > Yes. And you can do exactly the same, 1:1, if it's part of a package
>> >> > embedded in the "airflow-sdk" distribution? Or am I wrong? Why do you
>> >> > think it would not be possible if it's part of task_sdk?
>> >> >
>> >> > On Wed, May 13, 2026 at 6:43 PM Tzu-ping Chung via dev <[email protected]> wrote:
>> >> >> You can create multiple instances of the same coordinator class. Pass
>> >> >> appropriate arguments to suit your needs. This is in the AIP.
>> >> >>
>> >> >> [sdk]
>> >> >> coordinators = {
>> >> >>     "jdk-11": {
>> >> >>         "classpath": "airflow.sdk.coordinators.java.JavaCoordinator",
>> >> >>         "kwargs": {"java_executable": "/usr/lib/jvm/java-11-openjdk/bin/java", "jars_root": ["/files/old/lib"]}
>> >> >>     },
>> >> >>     "jdk-17": {
>> >> >>         "classpath": "airflow.sdk.coordinators.java.JavaCoordinator",
>> >> >>         "kwargs": {"java_executable": "/usr/lib/jvm/java-17-openjdk/bin/java", "jars_root": ["/files/new/lib"], "jvm_args": ["-Xmx1024m"]}
>> >> >>     }
>> >> >> }
>> >> >>
>> >> >> The problem is, classpath points to a class, so whatever this string
>> >> >> is needs to be kept compatible in future releases.
>> >> >>
>> >> >> > On 13 May 2026, at 23:59, Jarek Potiuk <[email protected]> wrote:
>> >> >> >
>> >> >> > > How do you make it changeable any time? The user needs to be able
>> >> >> > > to specify which coordinator to use in the config, and you can’t
>> >> >> > > break that later.
>> >> >> >
>> >> >> > We have only the *jdk* coordinator now, so I will reverse the
>> >> >> > question: how are you going to configure two "jdk" coordinators
>> >> >> > when you have separate distributions running Java? Are you planning
>> >> >> > to install two "coordinator-jdk" packages? This isn't possible in
>> >> >> > Python, unless you build almost the same package twice, with jdk-11
>> >> >> > and jdk-19 built in. My understanding is that you will have
>> >> >> > configuration options to choose between "jdk-11" and "jdk-19". This
>> >> >> > "jdk" package of yours will simply have a list of "jdks" linking to
>> >> >> > the Java interpreters.
>> >> >> >
>> >> >> > So, it doesn't matter if it's a single "coordinator-jdk" package,
>> >> >> > or everything in an "airflow.sdk._coordinator.jdk" package or an
>> >> >> > "airflow.sdk._bridge.jdk" package in the task-sdk.
>> >> >> > Regardless, you cannot install two "jdk" packages, whether they
>> >> >> > are separate distributions or the package is in "task-sdk", and you
>> >> >> > have to configure which of the "jdk" bridges you want to use.
>> >> >> >
>> >> >> > Yes, sometime later, when we also have Go/TypeScript or other
>> >> >> > languages, we might decide to centralize some APIs, create a "true"
>> >> >> > coordinator package and separate distributions. And splitting into
>> >> >> > different packages will be absolutely no problem then. Nothing will
>> >> >> > stop us from doing it.
>> >> >> >
>> >> >> > It will save a lot of time on the whole "distribution" issue -
>> >> >> > including releases, packaging, CI and everything connected - and it
>> >> >> > absolutely does not block us from a further split later.
>> >> >> >
>> >> >> > J.
>> >> >> >
>> >> >> > On Wed, May 13, 2026 at 3:47 PM Tzu-ping Chung via dev <[email protected]> wrote:
>> >> >> >>
>> >> >> >> > On 13 May 2026, at 20:42, Jarek Potiuk <[email protected]> wrote:
>> >> >> >> >
>> >> >> >> > Not really. I proposed an internal package that can be changed
>> >> >> >> > **any time**. Users aren't supposed to use those items. We can
>> >> >> >> > clearly mark them with "_" and also describe them thoroughly in
>> >> >> >> > the public API documentation. And no, initially providers were
>> >> >> >> > **not** in Airflow at all - you started from step 2. Step 1 is
>> >> >> >> > that they were added at some point in time, long before my time.
>> >> >> >> > Hooks and operators as an "API" were created quite early in the
>> >> >> >> > concept of Airflow - and the first implementations were added
>> >> >> >> > then. Then, after common patterns emerged, those hooks and
>> >> >> >> > operators were grouped into providers (they were not initially)
>> >> >> >> > and only moved out after quite some time. As I see it - you even
>> >> >> >> > admit yourself that things will look different for different
>> >> >> >> > languages, and maybe we will not even need bridges for some of
>> >> >> >> > them at all. So why should we introduce a new concept if we
>> >> >> >> > currently know that it applies only to "JDK"? I fail to see why
>> >> >> >> > we should proceed if we already know the patterns are unlikely
>> >> >> >> > to be reusable in their current form.
>> >> >> >>
>> >> >> >> How do you make it changeable any time? The user needs to be able
>> >> >> >> to specify which coordinator to use in the config, and you can’t
>> >> >> >> break that later.
>> >> >> >>
>> >> >> >> ---------------------------------------------------------------------
>> >> >> >> To unsubscribe, e-mail: [email protected]
>> >> >> >> For additional commands, e-mail: [email protected]
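For concreteness, the version-coupling options discussed in the thread could be enforced with a startup check along these lines. This is only a sketch under assumed conventions: the version strings, function names, and the idea of checking at load time are all hypothetical, not actual Airflow code.

```python
# Hypothetical sketch of the coupling options from the thread.
# Option 1 (strict pinning): coordinator refuses to load unless its
# version exactly matches the installed Task SDK version.
def check_strict_pin(sdk_version: str, coordinator_version: str) -> None:
    """Raise unless the two versions are identical (option 1 coupling)."""
    if sdk_version != coordinator_version:
        raise RuntimeError(
            f"Coordinator {coordinator_version} is pinned to Task SDK "
            f"{coordinator_version}, but {sdk_version} is installed."
        )


# Option 2 (MAJOR.MINOR coupling): relax the check to the first two
# version components, so bugfix releases can move independently.
def check_minor_pin(sdk_version: str, coordinator_version: str) -> None:
    """Raise unless the MAJOR.MINOR components match (option 2 coupling)."""
    if sdk_version.split(".")[:2] != coordinator_version.split(".")[:2]:
        raise RuntimeError(
            "SDK and coordinator MAJOR.MINOR versions must match."
        )
```

Option 3 ("free-fall") would simply omit any such check, shifting the burden entirely onto a compatibility test matrix, which is the operational cost weighed above.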
