TLDR: A +1 to the proposal from Tiago for the release, with the addition of a short-term recommendation (on how to revert some of the temporary changes) and some perspective on a potential long-term alternative to consider.
* Short term: Tiago's plan describes a strategy that appears able to solve the build cycle issues we have, allowing us to proceed with the 10.0 release. We realize that some of the changes being made to enable the 10.0 release are going to be temporary. Therefore, as part of this proposal, I urge the team to also document how we are going to revert some of these temporary changes immediately after the release (*). More specifically, my recommendation is that we agree that the images and operator folders will be removed from kie-tools again and that development will continue on the existing repositories. But let’s discuss if people see this differently or if there might be other steps. The advantage of this approach is that it lets us move forward with the release, buys us time to find a consensus on the long-term solution, and minimizes the impact of temporary solutions on developers. It also requires us to find this consensus before the next release.

* Longer term: as already discussed to some degree in this thread, there seems to be an alternative to explore where we define stricter boundaries (for dependencies) between repositories, and create a build chain where images and operator are built after tools. That said, it’s fair to say that this proposal needs to be worked out and validated further, and initial assessments of the effort involved, if we want to do things right rather than rush into this, indicate it might take multiple months. We also need to discuss how we will resource this effort. And we could potentially combine this with other discussions that we will have in the near future. So if we agree to investigate this further, I would like to recommend moving forward with the more concrete temporary solution that Tiago is proposing for the 10.0 release.
Note that this means we don’t need to agree on the specifics of any longer-term alternative on this thread at this point; we can start a different conversation thread for that. I hope this can convince people to +1 the short-term approach described by Tiago for the release, with the addition of the recipe for reverting some of the temporary changes and the promise to further evaluate longer-term alternatives. For those who are interested, I also wanted to give a high-level indication of what this proposal might mean from my point of view, which is included below.

Thx,
Kris

[Optional reading] Alternative longer-term proposal

One could subdivide the work we do into two main streams: one focused more on the runtimes, one focused more on the tooling. In general, a lot of tooling can be built independently from the runtime and vice versa, where they communicate with each other through well-defined formats or APIs. However, once we start looking at more advanced use cases and the full end-to-end, this is where we need both tooling and runtime together. The goal is to create one release pipeline (**). The issue with cyclic dependencies between repos is imho twofold: 1) we haven’t been 100% consistent in separating runtimes and tooling this way and 2) we haven’t accommodated well for use cases where runtime and tooling need to be combined. Note that some of these dependencies might not be build-time dependencies but test and/or runtime dependencies only. As an alternative to one kie-tools monorepo that combines tooling, images and operator, I believe we can construct a pipeline where most of runtime and tooling can be built independently, but after runtime and tooling are built, we complete the build with other components/repositories, because they logically rely more on both.
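To make the staged pipeline idea a bit more concrete, here is a minimal Python sketch (standard library only, Python 3.9+). It is an illustration under assumptions, not a description of our actual build files: the names `core-runtimes`, `core-tooling`, `devui-extension`, `images` and `operator` are hypothetical shorthand, and the edges are simplified. It only shows that a graph with a repo-level cycle has no valid build order, while a graph where the "combined" components come after both cores can be built in stages.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical, simplified dependency maps: each key depends on the names in its set.
# CURRENT mirrors the repo-level cycle described in this thread
# (kogito-apps Dev UIs use kie-tools packages, kie-tools uses kogito-apps jitexecutor).
CURRENT = {
    "kogito-apps": {"kogito-runtimes", "kie-tools"},
    "kie-tools": {"kogito-apps"},
}

# PROPOSED is an assumed shape for the longer-term idea: the two cores do not
# depend on each other, and components that need both are built afterwards.
PROPOSED = {
    "core-runtimes": set(),
    "core-tooling": set(),
    "devui-extension": {"core-runtimes", "core-tooling"},
    "images": {"core-runtimes", "core-tooling"},
    "operator": {"images"},
}


def build_stages(deps):
    """Return a list of stages (each buildable in parallel), or None on a cycle."""
    sorter = TopologicalSorter(deps)
    try:
        sorter.prepare()  # raises CycleError if the graph contains a cycle
    except CycleError:
        return None
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        stages.append(ready)
        sorter.done(*ready)
    return stages


print(build_stages(CURRENT))   # None: no valid build order exists
print(build_stages(PROPOSED))  # three stages: cores, then combined components, then operator
```

In this toy model the two cores land in the same first stage, which is the property the alternative is after: they can be built independently, and everything that logically relies on both is completed afterwards.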
Examples of components that rely on both are a devui extension (a quarkus extension that embeds tooling), the devmode image (that also includes tooling features), or integration testing (where we want to test whether tooling and runtime work well together).

More specifically, this would mean:
1) making sure that there are well-defined boundaries between the core runtimes and core tooling so they don’t depend on each other at build time. We can decide to move components around where we think that makes sense, for example:
   * move ui code related to devui into kie-tools (as discussed before)
   * move kn-workflow to the operator repository, as it is more closely related to that
2) updating the CI and release pipelines so that core runtime and tooling repositories can be built first, followed by other repositories like images and operator, which could then rely on both.

(*) Note that technically there would be other options to achieve this, like cutting a release branch early and performing the changes only there, but given other work is still ongoing as well, we want to minimize the cherry-picking effort.

(**) Note that while the goal is to create one release pipeline, this does not necessarily mean that we can’t have smaller or optimized pipelines for CI and daily development, where the impact of changes is typically more localized.

On Tue, Mar 12, 2024 at 8:45 PM Tiago Bento <tiagobe...@apache.org> wrote:
> Hi everyone,
>
> Unfortunately, I can't do a tl;dr this time, as this matter requires a
> lot of context.
>
> This email will take you < 20 minutes to read, according to
> https://thereadtime.com/.
>
> As you may have followed on a separate thread
> (https://lists.apache.org/thread/nknm6j641qk2c7cl621tsy3fy98tsc69),
> many of us were working towards removing a circular dependency
> currently present between `kogito-apps` and `kie-tools`. As we
> progressed towards a solution, we kept finding the circular dependency
> pop up somewhere else.
> I'll do a breakdown of the things we did, and the results we had.
>
> Right now, even though we started the effort to move the Quarkus Dev
> UI modules to `kie-tools`, we haven't been able to do it yet, as we've
> been busy upgrading KIE Tools to Java 17, Maven 3.9.6, and Quarkus
> 3.2.9, compatible with Kogito Runtimes 999-20240218-SNAPSHOT. This
> effort was concluded this Monday, with
> https://github.com/apache/incubator-kie-tools/pull/2136.
>
> The current scenario we have is:
>
>        01. incubator-kie-kogito-runtimes
>   |==> 02. incubator-kie-kogito-apps
> C |    03. incubator-kie-kogito-examples
> Y |    04. incubator-kie-kogito-images
> C |    05. incubator-kie-kogito-serverless-operator
> L |    ==========================
> E |    06. incubator-kie-sandbox-quarkus-accelerator
>   |==> 07. incubator-kie-tools
>
>
> * As `kie-tools`/extended-services depends on
> `kogito-apps`/jitexecutor;
> * and `kogito-apps`/{sonataflow,bpmn}-quarkus-devui depend on
> `kie-tools`/{many packages}
>
>
> After moving the Quarkus Dev UIs to `kie-tools`, we would've had:
>
>        01. incubator-kie-kogito-runtimes
>        02. incubator-kie-kogito-apps
>        03. incubator-kie-kogito-examples
> C |==> 04. incubator-kie-kogito-images
> Y |    05. incubator-kie-kogito-serverless-operator
> C |    =====================
> L |    06. incubator-kie-sandbox-quarkus-accelerator
> E |==> 07. incubator-kie-tools
>
> * As `kie-tools`/kn-plugin-workflow depends on
> `kogito-images`/kogito-swf-devmode;
> * and `kogito-images`/kogito-swf-devmode depends on
> `kie-tools`/sonataflow-quarkus-devui
>
>
> After moving the `kogito-swf-devmode` image to `kie-tools`, we would've
> had:
>
>        01. incubator-kie-kogito-runtimes
>        02. incubator-kie-kogito-apps
>        03. incubator-kie-kogito-examples
>        04. incubator-kie-kogito-images
> C |==> 05. incubator-kie-kogito-serverless-operator
> Y |    =====================
> C |    06. incubator-kie-sandbox-quarkus-accelerator
> L |==> 07. incubator-kie-tools
> E
>
> * As `kie-tools`/kn-plugin-workflow depends on
> `kogito-serverless-operator`;
> * and `kogito-serverless-operator` depends on
> `kie-tools`/kogito-swf-devmode
>
>
> Clearly, we have a much bigger problem than a simple circular dependency.
>
> After multiple conversations with a lot of people, it's been really
> hard coming up with a simple solution that makes it possible to build
> Apache KIE in one shot, while preserving the way everyone is used to
> contributing to the multiple repositories we have. More than that,
> while making this assessment, I found more problems that, in my
> perspective, block Apache KIE 10.
>
> In light of that difficulty, I'm coming forward with my proposal for
> the Apache KIE release process, so we can use Apache's mechanisms to
> have a slower-paced, in-depth debate about this really complicated
> matter.
>
> I'll lay out my entire perspective about the current situation of our
> codebase, as well as problems I can currently see. I'll start with an
> analysis of the repositories and their purposes, point out some
> problems that I believe are blocking our 10 release, explain my
> proposal and discuss some consequences to what I'm proposing.
>
> Let's begin.
>
>
> # THE APACHE KIE REPOS
>
> A. DROOLS, OPTAPLANNER, & KOGITO (count: 11)
> - incubator-kie-kogito-pipelines @ `main`
> - incubator-kie-drools @ `main`
> - incubator-kie-optaplanner @ `main`
> - incubator-kie-optaplanner-quickstarts @ `main`
> - incubator-kie-kogito-runtimes @ `main`
> - incubator-kie-kogito-apps @ `main`
> - incubator-kie-kogito-examples @ `main`
> - incubator-kie-kogito-images @ `main`
> - incubator-kie-kogito-serverless-operator @ `main`
> - incubator-kie-kogito-docs @ `main`
> - incubator-kie-docs @ `main-kogito`
>
> B. TOOLS (count: 2)
> - incubator-kie-sandbox-quarkus-accelerator @ `0.0.0`
> - incubator-kie-tools @ `main`
>
> C. BENCHMARKS (count: 2)
> - incubator-kie-kogito-benchmarks @ `main`
> - incubator-kie-benchmarks @ `main`
>
> D. ARCHIVED (count: 1)
> - incubator-kie-kogito-operator
>
> E. "NON-CODE" (count: 5)
> - incubator-kie-issues @ `main`
>   (Issues only, README should be updated @ `main`. Same for GitHub
>   Actions workflows.)
> - incubator-kie-kogito-website @ `main`
>   (The Kogito website. Develop & deploy at the `main` branch.)
> - incubator-kie-website @ `main`
>   (The KIE website. Develop @ `main`. Push @ `deploy` to update the
>   website.)
> - incubator-kie-kogito-online @ `gh-pages`
>   (GitHub pages used to host sandbox.kie.org and KIE Tools' Chrome
>   Extension assets.)
> - incubator-kie-kogito-online-staging @ `main`
>   (Same as above, but for manual sanity checks during the staging
>   phase of a release.)
>
> TOTAL (count: 21)
>
> I grouped the repositories by category, and listed them in a
> topological order. Keep in mind that when flattening out a tree, there
> are multiple possibilities. For example, OptaPlanner could've been
> placed in any position after Drools.
>
> Category A repos are what I've been referring to as `drools` and
> `kogito-*` stream. Of course OptaPlanner is inside that stream, as the
> way these repositories reference each other is through Maven
> SNAPSHOTs. More specifically, the 999-SNAPSHOT version. This mechanism
> is well-known to the team, and although flawed for intra-day builds
> and disruptive for people in many different time zones, it is already
> very comfortable for everyone to work with, I assume.
>
> Contributions made to Category A have some dedicated pipelines, which
> are, at least to some extent, able to build cross-repo PRs together
> and verify that the codebase will continue working as expected after
> they're all merged. From what I could gather, there are some
> "sub-streams" currently configured for cross-repo PRs.
>
> - kogito-pipelines
> - drools, kogito-runtimes, kogito-apps, and kogito-examples
> - optaplanner, and optaplanner-quickstarts
> - kogito-images, and kogito-serverless-operator
> - kogito-docs
> - kie-docs
>
> This means that sending cross-repo PRs to any combination of repos
> that are not part of the same "sub-stream" cannot be verified before
> merging, making our contribution model dependent on individual
> contributors building stuff on their machines to verify that it works.
>
> I based this analysis on
> https://github.com/apache/incubator-kie-kogito-pipelines/blob/main/.ci/project-dependencies.yaml ,
> https://github.com/apache/incubator-kie-optaplanner/blob/main/.ci/buildchain-project-dependencies.yaml ,
> and
> https://github.com/apache/incubator-kie-kogito-pipelines/blob/main/.ci/jenkins/config/branch.yaml .
> Note that I'm not that familiar with these pipelines, so please
> someone correct me if I'm wrong.
>
> Category B repos are what I've been referring to as `kie-tools`
> stream. The first repo there is a template repository that is used by
> people starting projects from scratch on KIE Sandbox, similar to a
> Maven archetype, if you will. The other one is the KIE Tools monorepo,
> a polyglot monorepo with `pnpm` as its build system. Currently, KIE
> Tools hosts Java libraries and apps, TypeScript libraries and apps, Go
> apps, Docker images, and Helm charts. The `kie-tools` monorepo is
> configured to work with sparse checkouts and can do partial builds.
> Category B repos refer to Category A repos through timestamped
> SNAPSHOTs. This is a new mechanism we recently introduced that will
> build and publish immutable, persistent artifacts under a version
> following the 999-YYYYMMDD-SNAPSHOT format, published weekly every
> Sunday night. Timestamped SNAPSHOTs are an evolution to the Kogito
> releases, as we're now targeting one release for all of Apache KIE, so
> we can't have Kogito releases anymore.
>
> An important note here is that Category B repositories have been
> historically kept out of any automations we used to have, way back
> when Kogito started and we had the Business Central (a.k.a. v7) stream
> still going on. For this reason, Category B projects have developed
> their own automations, based on GitHub Actions. Category B repos have
> always depended on Category A repos using fixed versions. If Category
> B repos had adopted mutable SNAPSHOTs, breaking changes on
> Category A repositories would've had the potential to break Category B
> silently, leaving Category B with a broken development stream, and
> introducing unpleasant surprises for maintainers of Category B repos,
> as historically Category A contributors were not familiar with
> Category B repos.
>
> Contributions made to Category B repos go through a GitHub Actions
> workflow that builds the relevant part of the `kie-tools` monorepo for
> the changes introduced. Changes made to the pipeline itself are also
> picked up as part of PRs, allowing us to do things like atomically
> bumping the Node.js version, for example. More importantly, it allows
> us to upgrade the repository to a new timestamped SNAPSHOT together
> with the changes necessary to make it stay green.
>
> This setup, however, makes it impossible to have cross-repo PRs
> involving Category A and Category B simultaneously, with the current
> automations we have.
>
> Category C repos are kind of floating around, and I'm not sure if
> there's much activity going on there. Regardless, as they're part of
> Apache KIE, they will be part of our release, so I listed them for us
> to take them into consideration too.
>
> Category D is self-explanatory. There's only one repo that has already
> been marked for being archived.
>
> Category E are repos that do not host code directly, and are either
> organizational entities, or host websites, that currently are not part
> of any pipelines we have.
>
> This lack of unification between Category A and Category B is, IMHO,
> what allowed us to introduce the infamous circular dependency between
> `kie-tools` and `kogito-apps`, which we now can describe as a circular
> dependency between Category A and Category B. The way I see it, if we
> had a single pipeline, building everything from `drools` to
> `kie-tools`, such flaws would've never been introduced, and we
> wouldn't be having this huge problem in our hands right now.
>
> My proposal for the Apache KIE release process sees this lack of
> unification as a central problem, not only for this release in
> particular, but for the community as a whole. It is my belief that we
> are all under the same roof, and that no contribution should be
> allowed to break any part of our codebase. With the increasing volume
> of code, and hopefully number of contributors too, we cannot keep
> counting on "common sense" to avoid breaking things. We're all humans
> after all, and it is our job to have mechanisms in place to prevent us
> from unwillingly making mistakes. Especially when these mistakes
> impact on parts of the codebase that we, individually, probably can't
> fix.
>
>
> # THE PROBLEMS WE HAVE RIGHT NOW
>
> P1. Quarkus Dev UIs @ `kogito-apps` depending on kiegroup's KIE Tools
> `0.32.0`.
> See:
> - https://github.com/search?q=repo%3Akiegroup%2Fkogito-apps+path%3Apackage.json+kie-tools&type=code
>
>
> P2. PR open for Kogito SWF images @ `kogito-images` depending on
> kiegroup's KIE Tools `0.32.0`.
> See:
> - https://github.com/apache/incubator-kie-tools/tree/main/packages/sonataflow-deployment-webapp
>
>
> P3. DashBuilder @ `kie-tools` depending on kiegroup's `lienzo` and
> `kie-soup` artifacts at version `7.59.0.Final`.
> See:
> - https://github.com/apache/incubator-kie-tools/blob/main/packages/dashbuilder/pom.xml#L64
> - https://github.com/search?q=repo%3Aapache%2Fincubator-kie-tools+path%3Apackages%2Fdashbuilder+%24%7Bversion.org.kie%7D&type=code
>
>
> P4. Multiple packages @ `kogito-apps` depending on kiegroup's
> Explainability `1.22.1.Final`.
> * This module was removed from the KIE codebase here:
>   https://github.com/apache/incubator-kie-kogito-apps/commit/bbb22c06d37e77b97aae6496d74abe43a8cfc965
>   and now lives on
>   https://github.com/trustyai-explainability/trustyai-explainability,
>   under a different GAV.
> * This new repo depends on Kogito and OptaPlanner, pointing to older
>   versions.
> See:
> - https://github.com/search?q=repo%3Aapache%2Fincubator-kie-kogito-apps+%3Eexplainability-core%3C&type=code
> - https://github.com/trustyai-explainability/trustyai-explainability/blob/main/pom.xml#L52-L53
>
>
> P5. `incubator-kie-sandbox-quarkus-accelerator` depending on Kogito
> `1.32.0.Final` and Quarkus `2.15.3.Final`.
> See:
> - https://github.com/apache/incubator-kie-sandbox-quarkus-accelerator/blob/0.0.0/pom.xml#L32-L38
>
>
> P6. Category C repos are out of date and not part of the Category A
> CI/Release pipelines.
> * incubator-kie-kogito-benchmarks: (Current version is `2.0-SNAPSHOT`,
>   depending on Kogito without a specific version, only by using
>   `http://localhost:8080`)
> * incubator-kie-benchmarks: (Current version is `1.0-SNAPSHOT`,
>   pointing to Drools 999-SNAPSHOT and OptaPlanner `8.45.0-SNAPSHOT`)
>
>
> P7. `kie-tools`/packages/kn-plugin-workflow has its E2E disabled after
> upgrading to 999-20240218-SNAPSHOT.
>
>
> In my perspective, P1 and P2 have the same solution, as they both
> suffer from the circular dependency between Category A and Category B.
> As Category A and Category B are both streams that have been really
> active, I see this as a blocker, as there are contributions that
> cannot be done, given that Category A depends on Category B with a
> dephasing of 1 release.
>
> P3 and P4, although not ideal, can be understood as technical debt.
> Depending on unmaintained projects is something we'll always be
> susceptible to, given time.
>
> P5 and P6 are easily fixable, as it's just a matter of making them
> part of the play.
>
> P7 is an isolated problem that won't impact the structure or anything
> that we're talking about here, but it is a regression we introduced
> recently.
>
> Assuming P3 and P4 can be ignored for Apache KIE 10, and that P5, P6,
> and P7 have easy fixes, the only problems left to discuss are P1 and
> P2, which can't be solved without a proper proposal.
>
>
> # THE PROPOSAL
>
> I'll try to be very meticulous here, since from my experience, any
> little miscalculation can lead to our release not working out in the
> end. To try and avoid that as much as possible, and do everything we
> can to have a successful Apache KIE 10 release, bear with me. I'll lay
> out a timeline of events that need to happen in order for our release
> to be published, with all artifacts ending up in the right places, but
> first, we need to solve problems P1 and P2.
>
> As you saw at the beginning of this email, all the attempts we made
> left us with the circular dependency showing up at a different place,
> but something all these places have in common is that they're all
> after kogito-apps, and before Category B.
>
> The first part of my proposal is the following:
>
> S1. We keep the original plan of moving the Quarkus Dev UIs from
> `kogito-apps` to `kie-tools`, together with Management and Task
> consoles from `kogito-images` to `kie-tools`.
> S2. We move the `kogito-swf-devmode` and `kogito-swf-builder` images
> from `kogito-images` to `kie-tools` too.
> S3. We move the entire `kogito-serverless-operator` repo inside a new
> package on `kie-tools`, keeping Git history.
>
> Solutions S1, S2, and S3 together solve problems P1 and P2. Of course
> the rest of https://github.com/apache/incubator-kie-issues/issues/967
> would still be done too.
>
> This doesn't come without consequences, of course, as the
> `kogito-swf-devmode` and `kogito-swf-builder` images, and the
> `kogito-serverless-operator` would be moving from Category A to
> Category B. This move would make them have to reference Category A
> repos through timestamped SNAPSHOTs. Since `kogito-images` and
> `kogito-serverless-operator` are already their own "sub-stream" inside
> Category A, though, contributions made in a cross-repo fashion to this
> "sub-stream" will continue being possible, now via a single PR to
> `kie-tools`. Cross-repo PRs between Category A and Category B will
> continue not being possible, and a 1-week delay between merging
> something on Category A and using it on Category B will still happen.
>
> It's worth mentioning that `kie-tools`, however, does allow for sparse
> checkouts and partial builds, so working with a subset of the monorepo
> is possible and encouraged. Making changes only to
> `packages/kn-plugin-workflow`, for example, will have the PR checks
> run in < 10 minutes, as you can see here:
> https://github.com/apache/incubator-kie-tools/actions/runs/8237244382/job/22525511722?pr=2136
> We're not compromising when running partial builds too. We know that
> the entire repo will continue working even after only building a small
> subset of the changes. Doing partial or full builds is automatically
> determined by the changes of a PR.
>
> Keep in mind that, even though I'm proposing we move a bunch of
> additional stuff into `kie-tools`, I see this as a TEMPORARY solution
> for our codebase. `kie-tools` would host some additional stuff
> TEMPORARILY so that we can release and continue moving forward.
>
> As I mentioned in other places, `kie-tools` became a polyglot monorepo
> out of necessity, and although I'm really proud of what we achieved
> there so far, I don't think `kie-tools` has a setup that is suitable
> for all the different nuances that compose our community.
> I'm well aware that a polyglot monorepo that does not follow
> widespread conventions will scare some people away, and as much as
> we've tried to make build instructions clear, we can't always get past
> the prejudice some people have towards the "front-end" ecosystem.
>
> With all that said, I keep thinking this is the best course of action
> for us right now. We keep most of our stuff unchanged, we unblock the
> release, and we have a working setup that will suit us well while we
> discuss and reach a conclusion regarding the future of our codebase
> structure.
>
> Let me paint a quick picture here of what our code base would look
> like, repository-wise, if my proposal is accepted:
>
> CATEGORY  REPO
> =====================
> A         incubator-kie-kogito-pipelines
> A         incubator-kie-drools
> A         incubator-kie-optaplanner
> A         incubator-kie-optaplanner-quickstarts
> A         incubator-kie-kogito-runtimes
> A         incubator-kie-kogito-apps
> A         incubator-kie-kogito-examples
> A         incubator-kie-kogito-images
> A         incubator-kie-kogito-docs
> A         incubator-kie-kogito-benchmarks
> A         incubator-kie-docs
> A         incubator-kie-benchmarks
> =====================
> B         incubator-kie-sandbox-quarkus-accelerator
> B         incubator-kie-tools
> =====================
> D         incubator-kie-kogito-operator
> =====================
> E         incubator-kie-issues
> E         incubator-kie-kogito-website
> E         incubator-kie-website
> E         incubator-kie-kogito-online
> E         incubator-kie-kogito-online-staging
> =====================
>
> * Category C becomes part of Category A, and
>   `kogito-serverless-operator` moves entirely inside `kie-tools`.
> * With `kogito-swf-{builder,devmode}` images and
>   `kogito-serverless-operator` inside `kie-tools`, there are no cycles
>   anymore, as inside `kie-tools`, we can granularly build:
>   1. packages/sonataflow-deployment-webapp
>   2. packages/sonataflow-quarkus-devui
>   3. packages/sonataflow-images (containing `kogito-swf-builder` and
>      `kogito-swf-devmode`)
>   4. packages/sonataflow-operator (contents from
>      `kogito-serverless-operator`)
>   5. packages/kn-plugin-sonataflow (`packages/kn-plugin-workflow`,
>      but renamed)
>
> The second part of the proposal is the release process itself,
> assuming the structure above is what we have.
>
> Here it is:
>
> 1. Define a timestamped SNAPSHOT to be used as cutting point for
>    Category A repos.
> 2. Update Category B repos to point to this timestamped SNAPSHOT, and
>    verify that everything is working.
> 3. At this point, with everything working, we can branch out to
>    `10.0.x`. Category A from the timestamped SNAPSHOT tag, and Category B
>    from `main`.
> 4. All Category A and Category B repos update their versions to
>    10.0.0, in their `10.0.x` branches.
> 5. Update Category B repos to point to Category A repos using the
>    10.0.0 version.
> 6. At this point, we can vote on the release based on the `10.0.x`
>    branches, given we don't expect any code changes anymore.
> 7. After voting passes, we're good to start the release process.
> 8. Category A repos follow their manual/automated release process,
>    pointing to the `10.0.x` branch. Tags pushed to Git, and built
>    artifacts pushed to their registries.
> 9. We wait a little bit for Category A artifacts to be propagated on
>    registries. ~1 day.
> 10. Category B repos follow their manual/automated release process,
>     pointing to the `10.0.x` branch. Tags pushed to Git, and built
>     artifacts pushed to their registries.
> 11. Category D repos are ignored.
> 12. Category E repos can be manually tagged with 10.0.0 from their
>     default branches.
>
> More needs to be discussed if we're planning to maintain multiple
> release streams in parallel, but I guess it can wait for after Apache
> KIE 10.
>
> Thank you for reading, and I'm looking forward to hearing back from
> everyone.
>
> Of course, alternative solutions are possible.
> This email, however, summarizes my view of how we should attack the
> problem, considering disruption, required effort, the release process
> itself, and history. Feel free to propose alternatives. This is not a
> voting thread.
>
> Regards,
>
> Tiago Bento