Hi everyone,

We would like to start a discussion on whether users should be permitted to
directly modify a FlinkDeployment that is owned by a
FlinkBlueGreenDeployment.

Currently, a direct change to a FlinkDeployment (such as updating
parallelism) triggers an immediate job restart rather than a blue-green
transition. Because such edits are not propagated back to the BlueGreen
CR, they cause configuration drift. We propose making the BlueGreen CR the
single source of truth by rejecting direct user edits to owned
FlinkDeployments. This restriction would not apply to independent
FlinkDeployments, and a "break glass" (opt-out) mechanism would be
included.

If you agree this is worth exploring, we have identified two implementation
options. In both cases, the check would inspect the owner references of the
FlinkDeployment and reject the request when the owner is a
FlinkBlueGreenDeployment and the requester is not the operator's service
account:

1. Admission Webhook: Update the /validate endpoint to perform this
validation alongside existing spec checks.
2. Kubernetes ValidatingAdmissionPolicy: Use a native K8s policy to perform
the check (
https://kubernetes.io/docs/reference/access-authn-authz/validating-admission-policy/
).
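
To make the shared check concrete, here is a rough sketch of the decision
logic in Python. It is only an illustration, not operator code: the
AdmissionRequest class, the validate function, the enforcement_enabled
flag (standing in for the opt-out), and the service account name are all
assumptions made up for this example.

```python
from dataclasses import dataclass, field

# Assumed name for illustration; the real value depends on the installation.
# Kubernetes service account usernames have the form
# system:serviceaccount:<namespace>:<name>.
OPERATOR_SERVICE_ACCOUNT = "system:serviceaccount:flink-operator:flink-operator"
BLUE_GREEN_KIND = "FlinkBlueGreenDeployment"


@dataclass
class AdmissionRequest:
    """Hypothetical, simplified view of an update to a FlinkDeployment."""
    username: str
    owner_references: list = field(default_factory=list)  # dicts with a "kind" key


def is_owned_by_blue_green(request):
    """True if any owner reference points at a FlinkBlueGreenDeployment."""
    return any(ref.get("kind") == BLUE_GREEN_KIND
               for ref in request.owner_references)


def validate(request, enforcement_enabled=True):
    """Return (allowed, reason) for the request."""
    if not enforcement_enabled:
        # Opt-out / "break glass": the check is switched off entirely.
        return True, "enforcement disabled"
    if not is_owned_by_blue_green(request):
        # Independent FlinkDeployments are not restricted.
        return True, "independent FlinkDeployment"
    if request.username == OPERATOR_SERVICE_ACCOUNT:
        # The operator itself must still be able to manage owned resources.
        return True, "request made by the operator"
    return False, "owned by a FlinkBlueGreenDeployment; edit the BlueGreen CR instead"
```

With this sketch, an edit from a regular user to an owned FlinkDeployment
is rejected, while the same edit from the operator's service account, or
any edit to an independent FlinkDeployment, is allowed.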

We currently favor the ValidatingAdmissionPolicy approach. The /validate
endpoint is typically used for spec validation, whereas this change is
specifically about access control. A policy would also still be enforced if
webhooks are disabled or unavailable, and it could be switched on or off
through a dedicated Helm setting.

We would love to hear the community's thoughts on this proposal!

Thanks,
James
