gyfora opened a new pull request, #614:
URL: https://github.com/apache/flink-kubernetes-operator/pull/614

   ## What is the purpose of the change
   
   The goal here is to use the newly introduced declarative resource management 
REST API endpoints in Flink 1.18 to execute parallelism override config changes 
for job vertexes. This allows us to rescale jobs without performing costly full 
upgrades (restarting all pods and resources). This will make the autoscaler 
module much more stable and reliable.
   
   The operator already supported something slightly similar for standalone 
mode and reactive scheduler configuration where on TM replica change we would 
simply add new replicas.
   
   ## Brief change log
   
   ### Add required REST api request/response classes for 1.18
   
   In order to not rely on unreleased Flink 1.18 libraries, we simply copied 
the request/response classes for the new rest api.
   Combined with e2e tests this should be a good approach.
   
   ### Async scaling logic
   
   The rescale mechanism in Flink 1.18 is asynchronous and might take a 
non-defined time to complete. To track progress we trigger the scaling through 
the `FlinkService#scale` method and afterwards update the `lastReconciledSpec` 
and change the ReconciliationState to `UPGRADING`.
   
   The observer will subsequently detect an in-progress scaling operation and 
will check the vertex parallelisms using `FlinkService#scalingCompleted` to 
move a `DEPLOYED` reconciliation state.
   
   *Note: The rescale api logic is currently only implemented for the native 
mode. In the standalone mode it would not work as efficiently as the user would 
need to manually add more TM replicas before the scaling can succeed anyways.*
   
   ### Changes to `@SpecDiff` annotations
   
   Previously some spec fields were marked for SCALE difftype using the 
annotation, now it was required to separate what is SCALE / UPGRADE in the 
different deployment types (Standalone/Native). In native mode we do not 
support replica/global parallelism changes as scale operations (those rely on 
standalone + reactive mode).
   
   ### Required autoscaler changes
   
   The autoscaler previously ran under the assumption that if the job start 
time did not change we could not have a change in topology/parallelisms. It 
used this to detect redeployments and clear past metric history.
   
   With the rescale api the start time can stay the same and parallelism change 
so we need to clear the collected metrics when this happens. For this we always 
query the topology and compare the current parallelisms.
   
   We also change the base validation of a RUNNING state to a Running stable 
state (not in upgrading anymore).
   
   ## Verifying this change
   
    - Manual testing using the autoscaler example on local k8s cluster
    - New unit tests added for:
     - DiffType / SpecDiff changes
     - Native Flink service rescale logic
     - Reconciler changes for triggering rescale and updating status
     - Observer changes for detecting compeleted scale operations
     - **TODO: Autoscaler changes**
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changes to the `CustomResourceDescriptors`: 
no
     - Core observer or reconciler logic that is regularly executed: yes
   
   ## Documentation
   
     - Does this pull request introduce a new feature? yes
     - If yes, how is the feature documented? TODO
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to