gerkElznik opened a new pull request, #1126:
URL: https://github.com/apache/flink-kubernetes-operator/pull/1126

   ## What is the purpose of the change
   
   The Helm chart advertises `defaultConfiguration.config.yaml` (added in 
FLINK-38409), but a downstream consumer who sets it never gets it mounted — the 
operator silently runs on `flink-conf.yaml` built from chart defaults, while 
the user's `config.yaml` content lands in the `flink-operator-config` ConfigMap 
but is never read by the operator JVM.
   
   Three interacting causes:
   
   1. `values.yaml` hardcodes a non-empty 
`defaultConfiguration.flink-conf.yaml`. Helm's additive merge means a 
downstream `-f my-values.yaml` cannot unset it, so `hasKey "flink-conf.yaml"` 
is always true.
   2. `templates/controller/configmap.yaml` unconditionally emits **both** 
`config.yaml` and `flink-conf.yaml` keys.
   3. `templates/controller/deployment.yaml` selects the mounted file with an 
**independent** `hasKey "flink-conf.yaml"` — always true — so the `config.yaml` 
branch is unreachable.
   
   This PR resolves the file choice in the chart template and emits/mounts a 
single, statically named `config.yaml`. The approach was discussed and agreed 
on the JIRA ticket.
   
   ### Why one always-`config.yaml` file, appended from a flat seed
   
   The chart seed carries the **mandatory Java-17 `--add-opens`/`--add-exports` 
opts**, which must apply no matter which format the user supplies. So the seed 
is **prepended** to the resolved override — the same "append" model 
`log4j-*.properties` already uses — rather than **replaced** by it (the model 
`logback-*.xml` uses, since two XML documents can't be concatenated). It cannot 
use the replace model, because that would drop the Java-17 opts for anyone who 
sets their own `config.yaml`.
   
   Because the mounted result is parsed by Flink's strict `config.yaml` parser 
(snakeyaml-engine, `allowDuplicateKeys=false`), an appended **flat** user 
override of a key that is also in the seed would be a duplicate-key crash. Two 
consequences, both reflected here:
   - the seed is kept **flat** (a nested seed would instead collide with a 
*nested* user override, which is the recommended form); and
   - the two redundant seed keys `taskmanager.numberOfTaskSlots: 1` / 
`parallelism.default: 1` are dropped — they only restate Flink's compiled 
default of `1`, so removing them is behavior-neutral while shrinking the 
collision surface.
   
   ## Brief change log
   
   - **`configmap.yaml`** — emit a single config-file key, always named 
`config.yaml`, resolved with a nested `{{ with (index … "config.yaml") }} … {{ 
else }}{{ with (index … "flink-conf.yaml") }} … {{ end }}{{ end }}` (plain 
`with`/`else`, so it renders on every Helm version): the user's `config.yaml` 
wins, else their `flink-conf.yaml`, else the `values.yaml` default. Keys off 
value truthiness rather than `hasKey` (which downstream can't unset), 
consistent with the chart's existing `log4j`/`logback` blocks.
   - **`deployment.yaml`** — mount the key statically at `path: config.yaml`. 
Emit and mount are both literally `config.yaml`, so they can't drift; 
`config.yaml` is also the only filename Flink 2.x reads, so default installs 
keep booting after the operator's eventual 2.x rebase.
   - **`conf/flink-conf.yaml` → `conf/config.yaml`** (renamed) — the 
always-prepended seed, named to match what it is mounted as; kept flat; the two 
redundant `: 1` keys removed (see above).
   - **`values.yaml`** — restructured the showcase comment into two 
mutually-exclusive format examples (modern `config.yaml` / legacy 
`flink-conf.yaml`) with precedence + mount notes; opinionated defaults stay 
here (not relocated into the always-applied seed).
   - **docs** (`operations/configuration.md`, `operations/helm.md`, 
`development/guide.md`) — document the `config.yaml` format and precedence, and 
fix the now-stale `/opt/flink/conf/flink-conf.yaml` inspect path.
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
   - **helm-unittest matrix** 
(`tests/controller/{configmap,deployment}_test.yaml`): exactly one config-file 
key, always `config.yaml` (never `flink-conf.yaml`), carrying today's effective 
defaults with the two dropped seed keys absent; a user `flink-conf.yaml` still 
flows in when `config.yaml` is unset; `config.yaml` wins when both are set; and 
a flat override of a former seed key no longer duplicates it. The Deployment 
mounts the key statically as `config.yaml`. (Values are injected via structured 
`set:`; a dotted-string `set:` no-ops on keys containing dots.)
   - **Parser load test**: each matrix row's rendered `config.yaml` was loaded 
through a strict, no-duplicate-key YAML parser equivalent to Flink's 
`config.yaml` loader, with a positive control confirming the check rejects a 
genuine duplicate. All rows parse clean.
   - **In a kind cluster**: installed the chart and confirmed the operator 
loads `/opt/flink/conf/config.yaml` via Flink's standard YAML parser — with 
scalar, nested-map, and list values — that the ConfigMap carries a single 
`config.yaml` key, and that the operator reconciles FlinkDeployments with both 
flat and nested `flinkConfiguration`. `ct lint` against the chart-testing image 
(Helm 3.16.4) confirms the resolver renders on older Helm too.
   - `helm lint` and a full `helm template` render pass.
   - Effective-config equivalence for a default install: the mounted payload is 
unchanged except the two dropped keys, which resolve to Flink's compiled 
default of `1` either way.
   
   ## Does this pull request potentially affect one of the following parts:
   
   - Dependencies (add or upgrade a dependency): **no**
   - The public API (`CustomResourceDescriptors`): **no**
   - Core observer or reconciler logic that is regularly executed: **no** (Helm 
chart + docs only)
   
   ## Documentation
   
   - Does this introduce a new feature? **No** — it makes the existing 
`defaultConfiguration.config.yaml` work as advertised.
   - How is it documented? **docs** — `operations/configuration.md` and 
`operations/helm.md`.
   
   ---
   
   ## Notes for reviewers (call-outs / open questions)
   
   1. **Precedence flip.** When both keys are set, `config.yaml` now wins; the 
old `hasKey` behavior preferred `flink-conf.yaml`. Worth a release note.
   2. **ConfigMap shape change.** `flink-operator-config` now carries one 
config-file key instead of two — visible to anyone reading that ConfigMap 
directly.
   3. **Coordination with FLINK-39571.** That ticket's "known limitation" note 
on the configuration page isn't on `main` yet; if it lands first, I'll rebase 
and remove it.
   4. **Residual collision surface.** A *flat* `config.yaml` override of the 
two remaining `kubernetes.operator.default-configuration.flink-version.*` 
Java-17 seed keys could still duplicate-key. It's rare (they're rarely 
flat-overridden, and the recommended nested form is collision-free). The clean 
long-term fix is shipping framework defaults as compiled `ConfigOption` 
defaults rather than a text-prepended seed — proposed as a separate follow-up.
   5. **Chinese docs.** The `.zh` translations (incl. the `guide.md` 
inspect-path line) are intentionally left for the project's separate 
Chinese-sync pass.
   6. **Backports?** Which active `release-*` branch(es) should this target?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to