gyfora opened a new pull request, #439:
URL: https://github.com/apache/flink-kubernetes-operator/pull/439
## What is the purpose of the change
Currently the operator uses `patchStatus()` without any resource version
locking to update the Flink resources. This means that the operator overwrites
whatever status was there previously. This has been convenient so far, but with
leader election it is much more likely that we get into situations where this
becomes problematic.
This change introduces optimistic locking for the status which detects if
someone externally modified the status without the reconciliation logic being
aware of it.
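For illustration, here is a minimal sketch of the guarded update path, assuming fabric8's `lockResourceVersion(...)`/`replace(...)` DSL on a generic resource; the class and exception names are made up, and the actual change guards the status subresource specifically, so the real call may differ:

```java
import io.fabric8.kubernetes.api.model.HasMetadata;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientException;

public final class GuardedStatusUpdateSketch {

    /** Hypothetical exception type used to abort and retrigger the reconciliation. */
    public static class StatusConflictException extends RuntimeException {
        public StatusConflictException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Previously the operator issued an unconditional patchStatus(), overwriting
     * whatever status was stored. The guarded variant below ties the write to the
     * resourceVersion the operator last observed, so the API server rejects it
     * with HTTP 409 Conflict if anything was modified in the meantime.
     */
    public static <CR extends HasMetadata> void guardedUpdate(
            KubernetesClient client, CR resource, String lastObservedResourceVersion) {
        try {
            client.resource(resource)
                    .lockResourceVersion(lastObservedResourceVersion)
                    .replace(resource);
        } catch (KubernetesClientException e) {
            if (e.getCode() == 409) {
                // Someone else changed the resource since we last saw it:
                // abort so the reconciliation gets retriggered with fresh state.
                throw new StatusConflictException("Status was modified externally", e);
            }
            throw e;
        }
    }
}
```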
There are 2 main problems that we target here:
1. Status updates by zombie operators that have lost leadership but have not
realized it (or died) yet.
2. Stale status being read when a new leader starts.
**Why would these happen?**
Zombie operator:
It could in theory happen that an operator loses leadership in the middle of a
reconciliation due to a very long GC pause (or some network issue or similar),
and its current CR reconcile loop keeps running while the new leader has
already started to reconcile the same resource. This is very unlikely, but it
can happen with leader election and a standby operator. In these cases we do
not want to allow the old operator that lost leadership to make any further
status updates. The new logic guarantees that once the new leader has made a
status update, the old one can never make one again.
Stale status:
When the new leader starts processing (after being on standby), there is no
guarantee that the status/spec it reconciles the first time is up to date. This
can happen due to some unlucky cache update timing, or even a zombie operator
submitting late status updates. The current operator logic relies heavily on
seeing the last status; otherwise we can hit some very weird corner cases that
would definitely cause problems for the resources.
**How the new logic tackles this in a safe way**
The new logic only allows a status update to go through when the operator holds
the latest status information, so it effectively acts as optimistic locking on
the current status. If anyone else changed the status in the meantime, we
simply throw an error and retrigger the reconciliation. This is safe to do
because the reconcile logic already runs with the assumption that the operator
can fail at any time before a status update, and we always use the status as a
"write-ahead log" of the actions we are taking. Zombie operators that have
already lost leadership will therefore never reconcile again (leader election
guarantees that), and in all other cases the retriggered reconciliation gives
us the latest version of the resource.
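As a rough illustration of where this check sits in the reconcile loop, here is a JOSDK-style sketch; it is not the operator's actual controller, `GenericKubernetesResource` stands in for the Flink CRD class, and all helper methods are placeholders:

```java
import io.fabric8.kubernetes.api.model.GenericKubernetesResource;
import io.javaoperatorsdk.operator.api.reconciler.Context;
import io.javaoperatorsdk.operator.api.reconciler.Reconciler;
import io.javaoperatorsdk.operator.api.reconciler.UpdateControl;

public class OptimisticLockingReconcilerSketch
        implements Reconciler<GenericKubernetesResource> {

    @Override
    public UpdateControl<GenericKubernetesResource> reconcile(
            GenericKubernetesResource cr, Context<GenericKubernetesResource> ctx) {

        // The reconcile logic records what it is about to do in the status before
        // acting, so an aborted loop can always be resumed safely from the status.
        observeAndReconcile(cr);

        // The status write goes through the optimistic-locking path. If someone else
        // modified the status in the meantime this throws, the framework retries the
        // event, and the next loop runs against a freshly read, up-to-date resource.
        flushStatusWithLocking(cr);

        return UpdateControl.noUpdate();
    }

    private void observeAndReconcile(GenericKubernetesResource cr) {
        // Placeholder for the observe / reconcile steps.
    }

    private void flushStatusWithLocking(GenericKubernetesResource cr) {
        // Placeholder for the version-guarded status update (see the sketch above).
    }
}
```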
*Note:*
It's not easy to write unit tests that validate the logic as the fabric8
mockserver/utilities do not work well with optimistic locking. See
https://github.com/fabric8io/kubernetes-client/issues/4573 for details.
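Since the mock server cannot exercise this path, one way to reproduce the conflict against a real API server (for example in an IT case running on minikube) is sketched below; this is only an illustration of the scenario, not the test added in this PR, and the class and method names are made up:

```java
import io.fabric8.kubernetes.api.model.HasMetadata;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientException;

public final class ConflictScenarioSketch {

    /**
     * Simulates the zombie/stale scenario against a real API server: a stale copy
     * of the resource is written after someone else already changed it, which must
     * surface as an HTTP 409 Conflict.
     */
    public static <CR extends HasMetadata> boolean conflictIsDetected(
            KubernetesClient client, CR staleCopy, CR externallyModifiedCopy) {
        // Someone else (the "new leader" or an external actor) updates the resource first.
        client.resource(externallyModifiedCopy).createOrReplace();

        // The stale writer then tries to push its old view, locked to the old resourceVersion.
        try {
            client.resource(staleCopy)
                    .lockResourceVersion(staleCopy.getMetadata().getResourceVersion())
                    .replace(staleCopy);
            return false; // the stale write went through: locking did not take effect
        } catch (KubernetesClientException e) {
            return e.getCode() == 409; // conflict detected as expected
        }
    }
}
```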
## Brief change log
- *Add optimistic locking for status updates*
- *Remove empty strings from status*
- *Add new integration test*
## Verifying this change
- Extended the operator IT case
- Manually verified on minikube
- e2es
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., are there any changes to the `CustomResourceDescriptors`:
no
- Core observer or reconciler logic that is regularly executed: yes
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable