gerkElznik opened a new pull request, #1126:
URL: https://github.com/apache/flink-kubernetes-operator/pull/1126
## What is the purpose of the change
The Helm chart advertises `defaultConfiguration.config.yaml` (added in
FLINK-38409), but a downstream consumer who sets it never gets it mounted — the
operator silently runs on `flink-conf.yaml` built from chart defaults, while
the user's `config.yaml` content lands in the `flink-operator-config` ConfigMap
but is never read by the operator JVM.
Three interacting causes:
1. `values.yaml` hardcodes a non-empty
`defaultConfiguration.flink-conf.yaml`. Helm's additive merge means a
downstream `-f my-values.yaml` cannot unset it, so `hasKey "flink-conf.yaml"`
is always true.
2. `templates/controller/configmap.yaml` unconditionally emits **both**
`config.yaml` and `flink-conf.yaml` keys.
3. `templates/controller/deployment.yaml` selects the mounted file with an
**independent** `hasKey "flink-conf.yaml"` — always true — so the `config.yaml`
branch is unreachable.
This PR resolves the file choice in the chart template and emits/mounts a
single, statically named `config.yaml`. The approach was discussed and agreed
on the JIRA ticket.
### Why one always-`config.yaml` file, appended from a flat seed
The chart seed carries the **mandatory Java-17 `--add-opens`/`--add-exports`
opts**, which must apply no matter which format the user supplies. So the seed
is **prepended** to the resolved override — the same "append" model
`log4j-*.properties` already uses — rather than **replaced** by it (the model
`logback-*.xml` uses, since two XML documents can't be concatenated). It cannot
use the replace model, because that would drop the Java-17 opts for anyone who
sets their own `config.yaml`.
Because the mounted result is parsed by Flink's strict `config.yaml` parser
(snakeyaml-engine, `allowDuplicateKeys=false`), an appended **flat** user
override of a key that is also in the seed would be a duplicate-key crash. Two
consequences, both reflected here:
- the seed is kept **flat** (a nested seed would instead collide with a
*nested* user override, which is the recommended form); and
- the two redundant seed keys `taskmanager.numberOfTaskSlots: 1` /
`parallelism.default: 1` are dropped — they only restate Flink's compiled
default of `1`, so removing them is behavior-neutral while shrinking the
collision surface.
## Brief change log
- **`configmap.yaml`** — emit a single config-file key, always named
`config.yaml`, resolved with a nested `{{ with (index … "config.yaml") }} … {{
else }}{{ with (index … "flink-conf.yaml") }} … {{ end }}{{ end }}` (plain
`with`/`else`, so it renders on every Helm version): the user's `config.yaml`
wins, else their `flink-conf.yaml`, else the `values.yaml` default. Keys off
value truthiness rather than `hasKey` (which downstream can't unset),
consistent with the chart's existing `log4j`/`logback` blocks.
- **`deployment.yaml`** — mount the key statically at `path: config.yaml`.
Emit and mount are both literally `config.yaml`, so they can't drift;
`config.yaml` is also the only filename Flink 2.x reads, so default installs
keep booting after the operator's eventual 2.x rebase.
- **`conf/flink-conf.yaml` → `conf/config.yaml`** (renamed) — the
always-prepended seed, named to match what it is mounted as; kept flat; the two
redundant `: 1` keys removed (see above).
- **`values.yaml`** — restructured the showcase comment into two
mutually-exclusive format examples (modern `config.yaml` / legacy
`flink-conf.yaml`) with precedence + mount notes; opinionated defaults stay
here (not relocated into the always-applied seed).
- **docs** (`operations/configuration.md`, `operations/helm.md`,
`development/guide.md`) — document the `config.yaml` format and precedence, and
fix the now-stale `/opt/flink/conf/flink-conf.yaml` inspect path.
## Verifying this change
This change added tests and can be verified as follows:
- **helm-unittest matrix**
(`tests/controller/{configmap,deployment}_test.yaml`): exactly one config-file
key, always `config.yaml` (never `flink-conf.yaml`), carrying today's effective
defaults with the two dropped seed keys absent; a user `flink-conf.yaml` still
flows in when `config.yaml` is unset; `config.yaml` wins when both are set; and
a flat override of a former seed key no longer duplicates it. The Deployment
mounts the key statically as `config.yaml`. (Values are injected via structured
`set:`; a dotted-string `set:` no-ops on keys containing dots.)
- **Parser load test**: each matrix row's rendered `config.yaml` was loaded
through a strict, no-duplicate-key YAML parser equivalent to Flink's
`config.yaml` loader, with a positive control confirming the check rejects a
genuine duplicate. All rows parse clean.
- **In a kind cluster**: installed the chart and confirmed the operator
loads `/opt/flink/conf/config.yaml` via Flink's standard YAML parser — with
scalar, nested-map, and list values — that the ConfigMap carries a single
`config.yaml` key, and that the operator reconciles FlinkDeployments with both
flat and nested `flinkConfiguration`. `ct lint` against the chart-testing image
(Helm 3.16.4) confirms the resolver renders on older Helm too.
- `helm lint` and a full `helm template` render pass.
- Effective-config equivalence for a default install: the mounted payload is
unchanged except the two dropped keys, which resolve to Flink's compiled
default of `1` either way.
## Does this pull request potentially affect one of the following parts:
- Dependencies (add or upgrade a dependency): **no**
- The public API (`CustomResourceDescriptors`): **no**
- Core observer or reconciler logic that is regularly executed: **no** (Helm
chart + docs only)
## Documentation
- Does this introduce a new feature? **No** — it makes the existing
`defaultConfiguration.config.yaml` work as advertised.
- How is it documented? **docs** — `operations/configuration.md` and
`operations/helm.md`.
---
## Notes for reviewers (call-outs / open questions)
1. **Precedence flip.** When both keys are set, `config.yaml` now wins; the
old `hasKey` behavior preferred `flink-conf.yaml`. Worth a release note.
2. **ConfigMap shape change.** `flink-operator-config` now carries one
config-file key instead of two — visible to anyone reading that ConfigMap
directly.
3. **Coordination with FLINK-39571.** That ticket's "known limitation" note
on the configuration page isn't on `main` yet; if it lands first, I'll rebase
and remove it.
4. **Residual collision surface.** A *flat* `config.yaml` override of the
two remaining `kubernetes.operator.default-configuration.flink-version.*`
Java-17 seed keys could still duplicate-key. It's rare (they're rarely
flat-overridden, and the recommended nested form is collision-free). The clean
long-term fix is shipping framework defaults as compiled `ConfigOption`
defaults rather than a text-prepended seed — proposed as a separate follow-up.
5. **Chinese docs.** The `.zh` translations (incl. the `guide.md`
inspect-path line) are intentionally left for the project's separate
Chinese-sync pass.
6. **Backports?** Which active `release-*` branch(es) should this target?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]