Thanks Bugra for dropping the structure draft!

I like it very much as it is lean.

I do not have extensive experience with Kustomize, so it would be great to hear from others who have more. But it is all "code" anyway, so if we see over time that the structure does not suit us, we can still extend and change it.

Added a few small (non-blocking) comments.

Do we need to add a disclaimer that it is "not a release" but just provided "as is"?

Jens

On 27.04.26 13:24, Jarek Potiuk wrote:
Hi Bugra,

I really like the PoC and its direction. While I was involved in defining
the broad approach and created an early Claude-coded PoC, that version was
far too complex, opinionated, and large.

Seeing this simple PR—which combines minimal code with architectural and
user documentation—makes this discussion much more meaningful and informed.
This perfectly demonstrates the power of a
"documentation-in-parallel-to-code" approach.

Best,
Jarek

On Mon, Apr 27, 2026 at 9:29 AM Zhe-You(Jason) Liu <[email protected]>
wrote:

Hi Bugra,

Thanks for raising the Kustomize discussion. I haven't gone through the doc
thoroughly yet, but just FYI, here is some context I have regarding the
Kustomize approach. This might be helpful in coming up with a final
structure that fits all the use cases we need to support while ensuring
good long-term maintenance.

For example:
- Add optional OTel service to the Airflow Helm Chart #64902 [1]
- Helm chart support for periodic API server rollout restarts on Kubernetes
#61432 [2]

Additionally, there is a Slack thread discussing the Kustomize direction
[3].

[1] https://github.com/apache/airflow/pull/64902#issuecomment-4206639363
[2] https://github.com/apache/airflow/pull/61636#issuecomment-3881992323
[3]
https://apache-airflow.slack.com/archives/C027H098M1C/p1770794021001679

Best,
Jason

On Mon, Apr 27, 2026 at 8:16 AM Buğra Öztürk <[email protected]>
wrote:

Hi all,

I have started working on the PoC for the Kustomize direction, as
mentioned in the thread for KEDA.

Here is the approach I have in mind to keep this stable and make further
iterations faster: align on the fundamentals before building further.
Smaller increments should make reviews easier and allow for quicker
course correction. Once the foundation is in place, the remaining work
should move faster.

* Share the directory structure in this first PoC example (not fully
  tested yet), with CI/pre-commit checks focusing only on validating the
  agreed structure

* Collect feedback, review, and merge the shared PR

* Propose and build a smoke test on top of the KEDA overlay in a
  separate new PR

* Collect feedback, review, and merge the smoke test PR

* Test locally to check if smoke tests match

* Move KEDA overlay to testing in a new PR with the introduction of a
  deprecation warning

PR: https://github.com/apache/airflow/pull/65897

Thoughts and early feedback very welcome.

Are we going to go over these steps for every overlay addition?
Short answer: no.
Long answer: this is early-maturity friction, and going step by step now
should make later overlay additions much less of a hassle. I hope that
an agreed, tested, documented approach will let the next additions land
in one go, in a single PR :)


Kind regards,
Bugra Ozturk

On Sat, Apr 25, 2026 at 5:37 PM Buğra Öztürk <[email protected]>
wrote:

Sorry for the formatting of the directory structure! In the mail app, it
looked fine. You can find it in the Google Doc as well:

https://docs.google.com/document/d/1bZsyrG5kjsYd2rJRiN3kR613lO6JPEBd4ItsySneOMw/edit?tab=t.cv476feyrxmf

On Sat, Apr 25, 2026 at 5:31 PM Buğra Öztürk <[email protected]>
wrote:

Hi all,

We have been working through a Helm chart refurbishment effort over the
past few months. The goal is to keep 1.2x stable for existing users
while preparing a cleaner next major release. I would like to share
where we have landed and open it up for feedback before going further.

*Branching strategy*

We created chart/v1-2x-test, mirroring how v3-1-test works for Airflow
itself.

- chart/v1-2x-test is the maintenance line. Bug fixes and stability
  work for 1.2x releases land here.
- main is for cleanup, deprecations, and preparation toward 2.0.

The split was deliberate. We wanted to give existing 1.x users a smooth
transition path without holding back the 2.0 work, and vice versa. 2.0
is intended as a real refurbishment rather than an incremental bump. It
will carry a fair number of breaking changes, but the upside is that it
gives users a clean starting point with a chart fully designed around
Airflow 3 and what comes after, instead of one carrying years of
accumulated assumptions from the 1.x line. Existing users on 1.2x are
not forced into the move, since the maintenance branch keeps shipping
releases for them, but anyone starting fresh or willing to migrate gets
a much simpler chart to work with.

We have already cut and released 1.21.0 from chart/v1-2x-test, so the
model is in place rather than hypothetical. The release went through
cleanly and gave us the separation we were after, which is part of the
reason the proposal feels concrete enough to bring here.

*Kustomize direction*

A recurring theme in our discussions has been that the chart carries a
fair number of components that are not Airflow-native. Kerberos,
Elasticsearch logging, gitSync, and PostgreSQL are good examples. They
make the chart heavier than it needs to be and pull us toward
maintaining things that already have external owners.

The proposal is to express these as Kustomize overlays that sit
alongside the chart as a guide for users, not as released chart
artifacts.

*Confirmed for Kustomize*

- Kerberos: Authentication variant, environment-specific, sidecar
  injection
- gitSync: DAG delivery mechanism, orthogonal to Airflow runtime
- Elasticsearch: External logging backend, not Airflow-native
- PostgreSQL: Can be expressed as plain Kubernetes resources

PgBouncer and StatsD are also candidates, but we want to investigate
them further before committing. They will not be in the first round of
overlays.

*Structure*

Overlays live in the repository but are not part of the chart release
artifact. Each overlay has a kustomization.yaml, the resources it
produces, and a STATUS file marking whether it is verified in CI or a
starting point that users can extend.

A rough sketch of how it would look in the repo:



```
chart/
  kustomize-overlays/
    README.rst
    CONTRIBUTING.rst
    keda/
      kustomization.yaml
      scaledobject.yaml
      STATUS
    kerberos/
      kustomization.yaml
      scheduler-sidecar-patch.yaml
      STATUS
```
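
To make the structure rule concrete, here is a minimal sketch of what a
CI/pre-commit structure check could look like. It is not taken from the
PoC PR. The function name find_structure_problems and the exact set of
required files are assumptions based only on the description above
(each overlay has a kustomization.yaml and a STATUS file).

```python
from pathlib import Path

# Assumption: every overlay directory must contain these two files, as
# described in the Structure section. The real check may require more.
REQUIRED_FILES = ("kustomization.yaml", "STATUS")


def find_structure_problems(overlays_root: Path) -> list[str]:
    """Return one message per overlay directory missing a required file."""
    problems = []
    # Only directories are overlays; README.rst and CONTRIBUTING.rst
    # sit next to them and are skipped.
    for overlay in sorted(p for p in overlays_root.iterdir() if p.is_dir()):
        for name in REQUIRED_FILES:
            if not (overlay / name).is_file():
                problems.append(f"{overlay.name}: missing {name}")
    return problems
```

A hook could call find_structure_problems(Path("chart/kustomize-overlays"))
and fail when the returned list is not empty.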



We will start with a PoC before agreeing on the broader rollout. HPA or
KEDA covers the standalone addition pattern and would go first or
second. Kerberos covers the post-render patch pattern and becomes the
template for any future sidecar injection use case. We are putting
together a first PoC now and will share it in this thread once it is in
a shape worth looking at, so the discussion has something concrete to
sit alongside the criteria below.

*Lifecycle*

The lifecycle mirrors how providers work, just on a smaller scale.

- A new overlay is proposed via a PR and lands with STATUS: not-tested.
- The contributor follows up with a test at
  chart/tests/kustomize/test_.py and flips STATUS to tested, either in
  the same PR or a focused follow-up. Equally, there can be a smoke test
  in CI to exercise the flow of Kustomize overlays, which can be a
  technical detail of the process flow.
- An overlay is deprecated by setting deprecated: true in STATUS along
  with a short message pointing to the replacement.
- Deprecated overlays stay around for one major chart version before
  they are removed, so users always have a window to migrate.

CONTRIBUTING.rst in the overlays directory is the authoritative
reference for all of this: the criteria, the exception process, status
conventions, and the migration guide pattern live there together.
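
As an illustration of the lifecycle states above, here is a small sketch
of how a STATUS file could be read and validated. The file format is not
defined in this thread, so the "key: value" layout, the field names
status and message, and the names OverlayStatus and parse_status are all
assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class OverlayStatus:
    status: str = "not-tested"  # "not-tested" or "tested"
    deprecated: bool = False
    message: str = ""  # points to the replacement when deprecated


def parse_status(text: str) -> OverlayStatus:
    """Parse a hypothetical 'key: value' STATUS file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so messages may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip().lower()] = value.strip()

    status = fields.get("status", "not-tested")
    if status not in ("not-tested", "tested"):
        raise ValueError(f"unknown status: {status!r}")

    deprecated = fields.get("deprecated", "false").lower() == "true"
    message = fields.get("message", "")
    if deprecated and not message:
        raise ValueError("a deprecated overlay needs a replacement message")
    return OverlayStatus(status, deprecated, message)
```

Under these assumptions, a new overlay would start with a file containing
"status: not-tested", and deprecating it would add "deprecated: true"
plus a message line.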

*Criteria for chart vs Kustomize*

The criteria will live at chart/kustomize-overlays/CONTRIBUTING.rst.

Belongs in the chart (all must be true):

- Required to run Airflow (scheduler, API server, dag-processor,
  triggerer, workers)
- Removing it requires changes to Airflow's own configuration
- No external owner

Belongs in Kustomize (any may be true):

- Can be expressed as a standalone Kubernetes resource without
  modifying chart-rendered resources
- Environment-specific (authentication schemes, logging backends,
  autoscaling controllers)
- Has an external owner (KEDA, Elasticsearch, any PostgreSQL
  distribution)
- Requires CRDs that the chart does not install
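
The two lists above boil down to an "all of" rule for the chart and an
"any of" rule for Kustomize. A rough sketch of that logic follows. The
Component fields and function names are made up for illustration, and
the sketch deliberately does not decide what happens to a component that
matches neither list, since the criteria above do not say.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    required_to_run_airflow: bool
    removal_needs_airflow_config_change: bool
    has_external_owner: bool
    standalone_resource: bool = False
    environment_specific: bool = False
    needs_uninstalled_crds: bool = False


def belongs_in_chart(c: Component) -> bool:
    """All three chart criteria must hold."""
    return (
        c.required_to_run_airflow
        and c.removal_needs_airflow_config_change
        and not c.has_external_owner
    )


def kustomize_reasons(c: Component) -> list[str]:
    """Any single reason is enough to qualify for an overlay."""
    checks = {
        "standalone resource": c.standalone_resource,
        "environment-specific": c.environment_specific,
        "external owner": c.has_external_owner,
        "requires CRDs the chart does not install": c.needs_uninstalled_crds,
    }
    return [reason for reason, hit in checks.items() if hit]
```

For example, a component flagged as having an external owner fails
belongs_in_chart and gets "external owner" back from kustomize_reasons.
The flag values for any real component are a judgment call for
reviewers, not something this sketch can decide.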

One invariant we want to keep is that the chart never removes a
component without a working overlay already in place. Users should
always have a migration path before anything disappears.

*Thoughts welcome*

The branching split is in place because we wanted the transition to 2.0
to be smooth for users, with 1.2x continuing to ship in parallel.
Sharing it here so the rest of the proposal sits in the right context.

What I would love to hear thoughts on:

- Do the chart vs Kustomize criteria hold up against the deployments
  you have run? Anything that feels off, missing, or too strict.
- Anything in the confirmed component list you would push back on, or
  anything you think should be added.

If you would rather leave longer notes on the Confluence page or the
Google Doc we have been working from, those are equally welcome. Links
below.

*References*

- Confluence:
  https://cwiki.apache.org/confluence/display/AIRFLOW/Helm+Refurbish
- Discussion notes (Google Doc):
  https://docs.google.com/document/d/1bZsyrG5kjsYd2rJRiN3kR613lO6JPEBd4ItsySneOMw/edit?usp=sharing
- Umbrella issue: https://github.com/apache/airflow/issues/64037

Thanks,

Bugra Ozturk

