+1 for the direction and the proposed PR. I really like the simplicity of it, and I'm looking forward to the integration with tests. I think the direction is good from both the maintainers' and the users' perspective.
Looking forward to more feedback too.

Regards,
Przemek

________________________________
From: Buğra Öztürk <[email protected]>
Sent: 28 April 2026 21:24
To: [email protected] <[email protected]>
Subject: Re: [DISCUSS] Helm Chart Refurbish: Kustomize Direction and Path to 2.0

Nice, thanks Jason for the previous context links! I think both look like really plausible candidates.

Thanks Jarek!

Thanks Jens! The disclaimer is a valid point. Although I added the phrase (`...the chart as a guide for users, **not** as part of the released chart artifact.`) to the README, we may need a disclaimer so that it is officially defined that it is provided `as is`. I will add something to the draft so we can also talk over a concrete one.

On Mon, Apr 27, 2026 at 7:57 PM Jens Scheffler <[email protected]> wrote:

> Thanks Bugra for dropping the structure draft!
>
> I like it very much as it is lean.
>
> I do not have extensive experience with Kustomize so would be great to
> hear from others who might have more experience. But also all is "code",
> so if we see over time that structure does not suit then still we are
> able to extend and change.
>
> Added small (non blocking) comments.
>
> Do we need to add a disclaimer that it is "not released" but just
> provided "as is"?
>
> Jens
>
> On 27.04.26 13:24, Jarek Potiuk wrote:
> > Hi Bugra,
> >
> > I really like the PoC and its direction. While I was involved in defining
> > the broad approach and created an early Claude-coded PoC, that version
> > was far too complex, opinionated, and large.
> >
> > Seeing this simple PR, which combines minimal code with architectural and
> > user documentation, makes this discussion much more meaningful and
> > informed. This perfectly demonstrates the power of a
> > "documentation-in-parallel-to-code" approach.
> >
> > Best,
> > Jarek
> >
> > On Mon, Apr 27, 2026 at 9:29 AM Zhe-You(Jason) Liu <[email protected]>
> > wrote:
> >
> >> Hi Bugra,
> >>
> >> Thanks for raising the Kustomize discussion.
> >> I haven't gone through the doc thoroughly yet, but just FYI, here is
> >> some context I have regarding the Kustomize approach. This might be
> >> helpful in coming up with a final structure that fits all the use cases
> >> we need to support while ensuring good long-term maintenance.
> >>
> >> For example:
> >> - Add optional OTel service to the Airflow Helm Chart #64902 [1]
> >> - Helm chart support for periodic API server rollout restarts on
> >>   Kubernetes #61432 [2]
> >>
> >> Additionally, there is a Slack thread discussing the Kustomize direction
> >> [3].
> >>
> >> [1] https://github.com/apache/airflow/pull/64902#issuecomment-4206639363
> >> [2] https://github.com/apache/airflow/pull/61636#issuecomment-3881992323
> >> [3] https://apache-airflow.slack.com/archives/C027H098M1C/p1770794021001679
> >>
> >> Best,
> >> Jason
> >>
> >> On Mon, Apr 27, 2026 at 8:16 AM Buğra Öztürk <[email protected]>
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> I have started working on the PoC for the Kustomize direction as
> >>> mentioned in the thread for KEDA.
> >>>
> >>> Here is what I am thinking for the approach to make this stable and
> >>> faster for further iterations. It is to align with the fundamentals
> >>> before building further. Smaller increments should make reviews easier
> >>> and allow for quicker course correction. Once the foundation is in
> >>> place, the remaining work should move faster.
> >>>
> >>> * Share the directory structure in this first PoC example (not fully
> >>>   tested yet), with CI/pre-commit checks focusing only on validating the
> >>>   agreed structure
> >>>
> >>> * Collect feedback, review, and merge the shared PR
> >>>
> >>> * Propose and build a smoke test on top of the KEDA overlay in a separate
> >>>   new PR
> >>>
> >>> * Collect feedback, review, and merge the smoke test PR
> >>>
> >>> * Test locally to check if smoke tests match
> >>>
> >>> * Move KEDA overlay to testing in a new PR with the introduction of a
> >>>   deprecation warning
> >>>
> >>> PR: https://github.com/apache/airflow/pull/65897
> >>>
> >>> Thoughts and early feedback very welcome.
> >>>
> >>> Are we going to go over these in every overlay addition?
> >>> Short answer: no.
> >>> Long answer: this is early-maturity friction, and going step by step now
> >>> should let us add new overlays later without too much hassle. I hope
> >>> that an agreed, tested, documented approach will let the next additions
> >>> land in one go, in a single PR :)
> >>>
> >>>
> >>> Kind regards,
> >>> Bugra Ozturk
> >>>
> >>> On Sat, Apr 25, 2026 at 5:37 PM Buğra Öztürk <[email protected]>
> >>> wrote:
> >>>
> >>>> Sorry for the formatting of the directory structure! In the mail app,
> >>>> it looked fine. You can find that specifically in Google Docs as well
> >>>>
> >>>> https://docs.google.com/document/d/1bZsyrG5kjsYd2rJRiN3kR613lO6JPEBd4ItsySneOMw/edit?tab=t.cv476feyrxmf
> >>>>
> >>>> On Sat, Apr 25, 2026 at 5:31 PM Buğra Öztürk <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> We have been working through a Helm chart refurbishment effort over
> >>>>> the past few months. The goal is to keep 1.2x stable for existing
> >>>>> users while preparing a cleaner next major release. I would like to
> >>>>> share where we have landed and open it up for feedback before going
> >>>>> further.
> >>>>>
> >>>>> *Branching strategy*
> >>>>>
> >>>>> We created chart/v1-2x-test, mirroring how v3-1-test works for Airflow
> >>>>> itself.
> >>>>>
> >>>>> - chart/v1-2x-test is the maintenance line. Bug fixes and stability
> >>>>>   work for 1.2x releases land here.
> >>>>> - main is for cleanup, deprecations, and preparation toward 2.0.
> >>>>>
> >>>>> The split was deliberate. We wanted to give existing 1.x users a
> >>>>> smooth transition path without holding back the 2.0 work, and vice
> >>>>> versa. 2.0 is intended as a real refurbish rather than an incremental
> >>>>> bump. It will carry a fair number of breaking changes, but the upside
> >>>>> is that it gives users a clean starting point with a chart fully
> >>>>> designed around Airflow 3 and what comes after, instead of one
> >>>>> carrying years of accumulated assumptions from the 1.x line. Existing
> >>>>> users on 1.2x are not forced into the move, since the maintenance
> >>>>> branch keeps shipping releases for them, but anyone starting fresh or
> >>>>> willing to migrate gets a much simpler chart to work with.
> >>>>>
> >>>>> We have already cut and released 1.21.0 from chart/v1-2x-test, so the
> >>>>> model is in place rather than hypothetical. The release went through
> >>>>> cleanly and gave us the separation we were after, which is part of the
> >>>>> reason the proposal feels concrete enough to bring here.
> >>>>>
> >>>>> *Kustomize direction*
> >>>>>
> >>>>> A recurring theme in our discussions has been that the chart carries a
> >>>>> fair number of components that are not Airflow-native. Kerberos,
> >>>>> Elasticsearch logging, gitSync, and PostgreSQL are good examples. They
> >>>>> make the chart heavier than it needs to be and pull us toward
> >>>>> maintaining things that already have external owners.
> >>>>>
> >>>>> The proposal is to express these as Kustomize overlays that sit
> >>>>> alongside the chart as a guide for users, not as released chart
> >>>>> artifacts.
> >>>>>
> >>>>> *Confirmed for Kustomize*
> >>>>>
> >>>>> - Kerberos: Authentication variant, environment-specific, sidecar
> >>>>>   injection
> >>>>> - gitSync: DAG delivery mechanism, orthogonal to Airflow runtime
> >>>>> - Elasticsearch: External logging backend, not Airflow-native
> >>>>> - PostgreSQL: Can be expressed as plain Kubernetes resources
> >>>>>
> >>>>> PgBouncer and StatsD are also candidates but we want to investigate
> >>>>> them further before committing. They will not be in the first round of
> >>>>> overlays.
> >>>>>
> >>>>> *Structure*
> >>>>>
> >>>>> Overlays live in the repository but are not part of the chart release
> >>>>> artifact. Each overlay has a kustomization.yaml, the resources it
> >>>>> produces, and a STATUS file marking whether it is verified in CI or a
> >>>>> starting point that users can extend.
> >>>>>
> >>>>> A rough sketch of how it would look in the repo:
> >>>>>
> >>>>> ```
> >>>>> chart/
> >>>>>   kustomize-overlays/
> >>>>>     README.rst
> >>>>>     CONTRIBUTING.rst
> >>>>>     keda/
> >>>>>       kustomization.yaml
> >>>>>       scaledobject.yaml
> >>>>>       STATUS
> >>>>>     kerberos/
> >>>>>       kustomization.yaml
> >>>>>       scheduler-sidecar-patch.yaml
> >>>>>       STATUS
> >>>>> ```
> >>>>>
> >>>>> We will start with a PoC before agreeing on the broader rollout. HPA
> >>>>> or KEDA covers the standalone addition pattern and would go first or
> >>>>> second. Kerberos covers the post-render patch pattern and becomes the
> >>>>> template for any future sidecar injection use case.
> >>>>> We are putting together a first PoC now and will share it in this
> >>>>> thread once it is in a shape worth looking at, so the discussion has
> >>>>> something concrete to sit alongside the criteria below.
> >>>>>
> >>>>> *Lifecycle*
> >>>>>
> >>>>> The lifecycle mirrors how providers work, just on a smaller scale.
> >>>>>
> >>>>> - A new overlay is proposed via a PR and lands with STATUS: not-tested.
> >>>>> - The contributor follows up with a test at
> >>>>>   chart/tests/kustomize/test_.py and flips STATUS to tested, either in
> >>>>>   the same PR or a focused follow-up. Equally, a smoke test on CI can
> >>>>>   exercise the flow of the Kustomize overlays; that is a technical
> >>>>>   detail of the process flow.
> >>>>> - An overlay is deprecated by setting deprecated: true in STATUS along
> >>>>>   with a short message pointing to the replacement.
> >>>>> - Deprecated overlays stay around for one major chart version before
> >>>>>   they are removed, so users always have a window to migrate.
> >>>>>
> >>>>> CONTRIBUTING.rst in the overlays directory is the authoritative
> >>>>> reference for all of this: the criteria, the exception process, status
> >>>>> conventions, and the migration guide pattern live there together.
> >>>>>
> >>>>> *Criteria for chart vs Kustomize*
> >>>>>
> >>>>> The criteria will live at chart/kustomize-overlays/CONTRIBUTING.rst.
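
For the overlay layout and STATUS lifecycle described above, here is a minimal Python sketch of what a structure check could look like. It is only an illustration under stated assumptions: the thread does not define the STATUS file format, so the `key: value` layout (`status`, `deprecated`, `message`) and the function names `parse_status`, `check_overlay`, and `check_all` are hypothetical and not taken from the PR.

```python
from pathlib import Path

# Files every overlay directory is expected to carry, per the structure sketch.
REQUIRED_FILES = ("kustomization.yaml", "STATUS")
# Status values named in the lifecycle description.
ALLOWED_STATUS = {"not-tested", "tested"}


def parse_status(text: str) -> dict[str, str]:
    """Parse 'key: value' lines. This file format is an assumption."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().lower()] = value.strip()
    return fields


def check_overlay(overlay_dir: Path) -> list[str]:
    """Return human-readable problems found in one overlay directory."""
    problems = []
    for name in REQUIRED_FILES:
        if not (overlay_dir / name).is_file():
            problems.append(f"{overlay_dir.name}: missing {name}")

    status_file = overlay_dir / "STATUS"
    if status_file.is_file():
        fields = parse_status(status_file.read_text())
        if fields.get("status") not in ALLOWED_STATUS:
            problems.append(
                f"{overlay_dir.name}: status must be one of {sorted(ALLOWED_STATUS)}"
            )
        # A deprecated overlay should point users to its replacement.
        if fields.get("deprecated") == "true" and not fields.get("message"):
            problems.append(f"{overlay_dir.name}: deprecated overlays need a message")
    return problems


def check_all(overlays_root: Path) -> list[str]:
    """Check every sub-directory of the overlays root, in sorted order."""
    problems = []
    for child in sorted(overlays_root.iterdir()):
        if child.is_dir():
            problems.extend(check_overlay(child))
    return problems
```

Pointed at chart/kustomize-overlays/, `check_all` would return an empty list when every overlay directory carries both files and a valid status, which is the kind of result a pre-commit hook could gate on.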
> >>>>>
> >>>>> Belongs in the chart (all must be true):
> >>>>>
> >>>>> - Required to run Airflow (scheduler, API server, dag-processor,
> >>>>>   triggerer, workers)
> >>>>> - Removing it requires changes to Airflow's own configuration
> >>>>> - No external owner
> >>>>>
> >>>>> Belongs in Kustomize (any may be true):
> >>>>>
> >>>>> - Can be expressed as a standalone Kubernetes resource without
> >>>>>   modifying chart-rendered resources
> >>>>> - Environment-specific (authentication schemes, logging backends,
> >>>>>   autoscaling controllers)
> >>>>> - Has an external owner (KEDA, Elasticsearch, any PostgreSQL
> >>>>>   distribution)
> >>>>> - Requires CRDs that the chart does not install
> >>>>>
> >>>>> One invariant we want to keep is that the chart never removes a
> >>>>> component without a working overlay already in place. Users should
> >>>>> always have a migration path before anything disappears.
> >>>>>
> >>>>> *Thoughts welcome*
> >>>>>
> >>>>> The branching split is in place because we wanted the transition to
> >>>>> 2.0 to be smooth for users, with 1.2x continuing to ship in parallel.
> >>>>> Sharing it here so the rest of the proposal sits in the right context.
> >>>>>
> >>>>> What I would love to hear thoughts on:
> >>>>>
> >>>>> - Do the chart vs Kustomize criteria hold up against the deployments
> >>>>>   you have run? Anything that feels off, missing, or too strict.
> >>>>> - Anything in the confirmed component list you would push back on, or
> >>>>>   anything you think should be added.
> >>>>>
> >>>>> If you would rather leave longer notes on the Confluence page or the
> >>>>> Google Doc we have been working from, those are equally welcome. Links
> >>>>> below.
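
The chart-versus-Kustomize criteria earlier in this message read as a small decision rule: all three chart conditions must hold, while any one Kustomize condition is enough. The sketch below spells that out in Python. The `Component` fields, the `placement` function, the chart-first tie-break, and the "undecided" fallback are assumptions made for illustration (the thread only says an exception process exists), and the example values for the scheduler and KEDA are illustrative rather than an agreed classification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Answers to the criteria questions for one component (hypothetical model)."""

    name: str
    required_to_run_airflow: bool = False
    removal_changes_airflow_config: bool = False
    has_external_owner: bool = False
    standalone_resource: bool = False
    environment_specific: bool = False
    needs_crds_not_in_chart: bool = False


def belongs_in_chart(c: Component) -> bool:
    # All three chart criteria must be true.
    return (
        c.required_to_run_airflow
        and c.removal_changes_airflow_config
        and not c.has_external_owner
    )


def kustomize_reasons(c: Component) -> list[str]:
    # Any single reason is enough to make the component an overlay candidate.
    checks = {
        "standalone Kubernetes resource": c.standalone_resource,
        "environment-specific": c.environment_specific,
        "has an external owner": c.has_external_owner,
        "requires CRDs the chart does not install": c.needs_crds_not_in_chart,
    }
    return [reason for reason, holds in checks.items() if holds]


def placement(c: Component) -> str:
    # Assumption: chart criteria are checked first; the thread defines no tie-break.
    if belongs_in_chart(c):
        return "chart"
    if kustomize_reasons(c):
        return "kustomize"
    return "undecided"  # would go through the exception process


scheduler = Component(
    "scheduler",
    required_to_run_airflow=True,
    removal_changes_airflow_config=True,
)
keda = Component(
    "keda",
    has_external_owner=True,
    standalone_resource=True,
    environment_specific=True,
    needs_crds_not_in_chart=True,
)

print(placement(scheduler))  # chart
print(placement(keda))       # kustomize
```

Returning the list of reasons rather than a bare boolean keeps the rule reviewable: a PR description can state exactly which criterion moved a component out of the chart.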
> >>>>>
> >>>>> *References*
> >>>>>
> >>>>> - Confluence:
> >>>>>   https://cwiki.apache.org/confluence/display/AIRFLOW/Helm+Refurbish
> >>>>> - Discussion notes (Google Doc):
> >>>>>   https://docs.google.com/document/d/1bZsyrG5kjsYd2rJRiN3kR613lO6JPEBd4ItsySneOMw/edit?usp=sharing
> >>>>> - Umbrella issue: https://github.com/apache/airflow/issues/64037
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Bugra Ozturk
> >>>>>
> >>>>> Kind regards,
> >>>>>
> >
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
