[ 
https://issues.apache.org/jira/browse/FLINK-39791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18085366#comment-18085366
 ] 

Gerk Elznik commented on FLINK-39791:
-------------------------------------

Thanks again for the guidance, [~dciupitu], I've opened the PR (linked on this 
issue as #1126). 

Implementation: the chart resolves the file in the template and always emits 
and mounts a single {{config.yaml}} ({{config.yaml}} > legacy 
{{flink-conf.yaml}} > the values.yaml default), with the seed renamed to 
{{conf/config.yaml}} and the two redundant default entries 
({{taskmanager.numberOfTaskSlots}} / {{parallelism.default}}) dropped. 
helm-unittest, a strict-parser load test, and a kind-cluster install all pass 
locally.

Could you assign this issue to me?  It'd be my first contribution to the 
project.  Happy to address any feedback — thanks!

> Helm chart silently ignores defaultConfiguration.config.yaml; always mounts 
> flink-conf.yaml from chart defaults
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-39791
>                 URL: https://issues.apache.org/jira/browse/FLINK-39791
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.14.0, kubernetes-operator-1.15.0
>            Reporter: Gerk Elznik
>            Priority: Major
>              Labels: helm, pull-request-available
>
> h2. Summary
> The Helm chart at {{helm/flink-kubernetes-operator/}} exposes a 
> {{defaultConfiguration.config.yaml}} parameter (showcased in {{values.yaml}} 
> per FLINK-38409 / PR #1022), but a downstream Helm consumer who sets it 
> cannot actually get {{config.yaml}} mounted into the operator pod. The chart 
> silently continues to mount {{{}flink-conf.yaml{}}}, populated only with the 
> chart's hardcoded defaults. The user's {{config.yaml}} content lands in the 
> {{flink-operator-config}} ConfigMap under the {{config.yaml}} key but is 
> never read by the operator JVM.
> h2. Root cause
> Two design choices interact:
> *1. {{helm/flink-kubernetes-operator/values.yaml}} hardcodes 
> {{defaultConfiguration.flink-conf.yaml}}* as a non-empty string (Flink-1.x 
> StatsD/SLF4J reporter config):
> {code:yaml}
> defaultConfiguration:   ...
>   flink-conf.yaml: |+
>     # Flink Config Overrides
>     kubernetes.operator.metrics.reporter.slf4j.factory.class: 
> org.apache.flink.metrics.slf4j.Slf4jReporterFactory
>     kubernetes.operator.metrics.reporter.slf4j.interval: 5 MINUTE
>     kubernetes.operator.reconcile.interval: 15 s
>     kubernetes.operator.observer.progress-check.interval: 5 s
> {code}
> *2. {{helm/flink-kubernetes-operator/templates/controller/deployment.yaml}} 
> selects the mounted file with:*
> {code:java}
> {{- if hasKey .Values.defaultConfiguration "flink-conf.yaml" }}
>   - key: flink-conf.yaml
>     path: flink-conf.yaml
> {{- else }}
>   - key: config.yaml
>     path: config.yaml
> {{- end }}
> {code}
> Helm value merging is additive on maps – a downstream -{{{}f 
> my-values.yaml{}}} or {{-set}} cannot remove a key that exists in the chart's 
> own {{{}values.yaml{}}}. Setting {{{}defaultConfiguration.flink-conf.yaml: 
> null{}}}, {{{}~{}}}, or {{""}} in a downstream
> values file still leaves the key present in {{.Values.defaultConfiguration}} 
> for {{hasKey}} purposes. So {{hasKey}} is always true and the {{config.yaml}} 
> branch of the conditional is unreachable for any normal consumer of the 
> published chart.
> h3. Compounding issue
> {{templates/controller/configmap.yaml}} unconditionally emits *both* 
> {{config.yaml}} and {{flink-conf.yaml}} keys into the rendered ConfigMap. So 
> even if a user works around the mount selection, the ConfigMap contains two 
> divergent files and the
> wrong one is authoritative based on what the Deployment volume's {{items}} 
> references.
> h3. Misleading showcase
> FLINK-38409 / PR #1022 added this comment to {{{}values.yaml{}}}:
> {quote} # Uncomment to use the new config.yaml format, but also comment out 
> the flink-config.yaml key.{quote}
> The instruction "comment out the flink-conf.yaml key" silently requires 
> editing the *chart's* own {{{}values.yaml{}}}, which downstream Helm 
> consumers cannot do without forking. The PR landed an undocumented workaround 
> as a documented feature.
> h2. Reproduction
>  # Install or {{helm template}} the published 1.15.0 chart with downstream 
> values that set {{defaultConfiguration.config.yaml}} to some Flink-2.x 
> structured-YAML content. Do *not* set 
> {{{}defaultConfiguration.flink-conf.yaml{}}}.
>  # Inspect the rendered Deployment: the 
> {{{}flink-operator-config-volume{}}}'s {{items}} references 
> {{{}flink-conf.yaml{}}}, not {{{}config.yaml{}}}.
>  # Inspect the rendered ConfigMap {{{}flink-operator-config{}}}: both 
> {{data.config.yaml}} and {{data.flink-conf.yaml}} are present. The user's 
> content is under {{{}config.yaml{}}}; the upstream defaults are under 
> {{{}flink-conf.yaml{}}}.
>  # The pod reads {{/opt/flink/conf/flink-conf.yaml}} and runs on upstream 
> defaults; the user's overrides are silently inert.
> h2. Expected behavior
> A user who provides {{defaultConfiguration.config.yaml}} and nothing for 
> {{defaultConfiguration.flink-conf.yaml}} should get:
>  * A Deployment that mounts {{{}config.yaml{}}}.
>  * A ConfigMap that does not include a stale, ignored {{flink-conf.yaml}} key 
> (or, if it must for back-compat, an explicitly empty one).
> h2. Suggested fixes (any one of)
>  # *Invert priority* in {{templates/controller/deployment.yaml}} – prefer 
> {{config.yaml}} when the user has explicitly populated it, fall back to 
> {{flink-conf.yaml}} only otherwise. Lossless for legacy users; gives 
> precedence to the explicit
> modern-format opt-in. Mirror the same priority in 
> {{templates/controller/configmap.yaml}} so only one key is emitted.
>  # *Drop the hardcoded default* for {{defaultConfiguration.flink-conf.yaml}} 
> in {{{}values.yaml{}}}. The same content is loaded from 
> {{conf/flink-conf.yaml}} via {{{}defaultConfiguration.append: true{}}}, so 
> nothing is lost; {{hasKey}} then actually
> discriminates based on user intent.
>  # *Treat empty values as absent* – replace {{hasKey ...}} with a check on 
> the resolved value (e.g. {{{}(index .Values.defaultConfiguration 
> "flink-conf.yaml"){}}}). Less ergonomic than #1 but the smallest template 
> change; lets downstream users
> effectively unset via {{{}flink-conf.yaml: ""{}}}.
> h2. Workaround
> Apply Kustomize patches downstream after {{{}helm template{}}}:
>  # {{op: remove}} the {{/data/flink-conf.yaml}} key from the rendered 
> ConfigMap.
>  # Strategic-merge the Deployment volume's {{items}} to reference 
> {{config.yaml}} plus the log4j (or logback) files.
> h2. Related tickets
>  * FLINK-35744 ("Support flink 1.19 config.yaml format", fixed in 1.12.0) – 
> added {{config.yaml}} support generally; did not address downstream override.
>  * FLINK-38409 ("Showcase in Helm chart values.yaml that config.yaml can be 
> used", merged via PR #1022) – added the misleading showcase comment.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to