gyfora opened a new pull request, #614:
URL: https://github.com/apache/flink-kubernetes-operator/pull/614
## What is the purpose of the change
The goal here is to use the newly introduced declarative resource management
REST API endpoints in Flink 1.18 to execute parallelism override config changes
for job vertexes. This allows us to rescale jobs without performing costly full
upgrades (restarting all pods and resources). This will make the autoscaler
module much more stable and reliable.
The operator already supported something slightly similar for standalone
mode and reactive scheduler configuration where on TM replica change we would
simply add new replicas.
## Brief change log
### Add required REST api request/response classes for 1.18
In order to not rely on unreleased Flink 1.18 libraries, we simply copied
the request/response classes for the new rest api.
Combined with e2e tests this should be a good approach.
### Async scaling logic
The rescale mechanism in Flink 1.18 is asynchronous and might take a
non-defined time to complete. To track progress we trigger the scaling through
the `FlinkService#scale` method and afterwards update the `lastReconciledSpec`
and change the ReconciliationState to `UPGRADING`.
The observer will subsequently detect an in-progress scaling operation and
will check the vertex parallelisms using `FlinkService#scalingCompleted` to
move a `DEPLOYED` reconciliation state.
*Note: The rescale api logic is currently only implemented for the native
mode. In the standalone mode it would not work as efficiently as the user would
need to manually add more TM replicas before the scaling can succeed anyways.*
### Changes to `@SpecDiff` annotations
Previously some spec fields were marked for SCALE difftype using the
annotation, now it was required to separate what is SCALE / UPGRADE in the
different deployment types (Standalone/Native). In native mode we do not
support replica/global parallelism changes as scale operations (those rely on
standalone + reactive mode).
### Required autoscaler changes
The autoscaler previously ran under the assumption that if the job start
time did not change we could not have a change in topology/parallelisms. It
used this to detect redeployments and clear past metric history.
With the rescale api the start time can stay the same and parallelism change
so we need to clear the collected metrics when this happens. For this we always
query the topology and compare the current parallelisms.
We also change the base validation of a RUNNING state to a Running stable
state (not in upgrading anymore).
## Verifying this change
- Manual testing using the autoscaler example on local k8s cluster
- New unit tests added for:
- DiffType / SpecDiff changes
- Native Flink service rescale logic
- Reconciler changes for triggering rescale and updating status
- Observer changes for detecting compeleted scale operations
- **TODO: Autoscaler changes**
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changes to the `CustomResourceDescriptors`:
no
- Core observer or reconciler logic that is regularly executed: yes
## Documentation
- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? TODO
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]