(flink-kubernetes-operator) branch main updated: [FLINK-33102][autoscaler] Document the autoscaler standalone and Extensibility of Autoscaler

gyfora Fri, 10 Nov 2023 00:06:03 -0800

This is an automated email from the ASF dual-hosted git repository.

gyfora pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/flink-kubernetes-operator.git



The following commit(s) were added to refs/heads/main by this push:
     new e3681982 [FLINK-33102][autoscaler] Document the autoscaler standalone and Extensibility of Autoscaler
e3681982 is described below

commit e36819820627dedaca66d33f85687166ac829395
Author: Rui Fan <[email protected]>
AuthorDate: Fri Nov 10 16:05:54 2023 +0800

    [FLINK-33102][autoscaler] Document the autoscaler standalone and Extensibility of Autoscaler
---
 docs/content/docs/custom-resource/autoscaler.md | 88 +++++++++++++++++++++++++
 flink-autoscaler/README.md                      |  2 +-
 2 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/docs/content/docs/custom-resource/autoscaler.md b/docs/content/docs/custom-resource/autoscaler.md
index 7f67c1a3..a73aecd9 100644
--- a/docs/content/docs/custom-resource/autoscaler.md
+++ b/docs/content/docs/custom-resource/autoscaler.md
@@ -197,6 +197,94 @@ The list of options will likely grow to cover more complex scaling scenarios.
 
 For a detailed config reference check the [general configuration page]({{< ref "docs/operations/configuration#autoscaler-configuration" >}})
 
+## Extensibility of Autoscaler
+
+The Autoscaler exposes a set of interfaces for storing autoscaler state, handling autoscaling events,
+and executing scaling decisions. How these are implemented is specific to the orchestration framework
+used (e.g. Kubernetes), but the interfaces are designed to be as generic as possible. The generic
+interfaces are:
+
+- **AutoScalerEventHandler** : Handles autoscaler events, such as ScalingReport,
+  AutoscalerError, etc. `LoggingEventHandler` is the default implementation; it logs events.
+- **AutoScalerStateStore** : Stores all state during scaling. `InMemoryAutoScalerStateStore` is
+  the default implementation. It keeps state on the Java heap, so the state is discarded when the
+  process restarts. A persistent state store, such as `JdbcAutoScalerStateStore`, is planned for the future.
+- **ScalingRealizer** : Applies scaling actions.
+- **JobAutoScalerContext** : Holds all details related to the current job.
+
+## Autoscaler Standalone
+
+**Flink Autoscaler Standalone** is an implementation of **Flink Autoscaler** that runs as a separate Java
+process. It computes a reasonable parallelism for every job vertex by monitoring metrics such as
+processing rate, busy time, etc.
+
+Flink Autoscaler Standalone rescales Flink jobs in-place through the REST API of
+[Externalized Declarative Resource Management](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/elastic_scaling/#externalized-declarative-resource-management).
+`RescaleApiScalingRealizer` is the default implementation of `ScalingRealizer`; it uses the Rescale API
+to apply parallelism changes.
+
+The Kubernetes Operator is well integrated with the Autoscaler. We strongly recommend using the Kubernetes Operator
+directly for Flink jobs on Kubernetes, and using Autoscaler Standalone only for Flink jobs in non-Kubernetes
+environments.
+
+### How To Use Autoscaler Standalone
+
+Currently, `Flink Autoscaler Standalone` only supports a single Flink cluster. It can be any type of
+Flink cluster, including:
+
+- Flink Standalone Cluster
+- MiniCluster
+- Flink YARN session cluster
+- Flink YARN application cluster
+- Flink Kubernetes session cluster
+- Flink Kubernetes application cluster
+- etc
+
+You can start a Flink Streaming job with the following ConfigOptions.
+
+```
+# Enable the Adaptive Scheduler to allow in-place rescaling.
+jobmanager.scheduler : adaptive
+
+# Enable the autoscaler and scaling
+job.autoscaler.enabled : true
+job.autoscaler.scaling.enabled : true
+job.autoscaler.stabilization.interval : 1m
+job.autoscaler.metrics.window : 3m
+```
+
+> Note: In-place rescaling is only supported since Flink 1.18. Flink jobs before version
+> 1.18 cannot be scaled automatically, but you can view the `ScalingReport` in the log.
+> `ScalingReport` shows the recommended parallelism for each vertex.
+
+After the Flink job starts, start the StandaloneAutoscaler process with the
+following command.
+
+```
+java -cp flink-autoscaler-standalone-{{< version >}}.jar \
+org.apache.flink.autoscaler.standalone.StandaloneAutoscalerEntrypoint \
+--flinkClusterHost localhost \
+--flinkClusterPort 8081
+```
+
+Update `flinkClusterHost` and `flinkClusterPort` to match your Flink cluster.
+In general, the host and port are the same as those of the Flink WebUI.
+
+### Extensibility of autoscaler standalone
+
+See [Extensibility of Autoscaler]({{< ref "docs/custom-resource/autoscaler#extensibility-of-autoscaler" >}})
+for the extensibility of the generic autoscaler.
+
+`Autoscaler Standalone` isn't responsible for job management, so it doesn't have job information.
+`Autoscaler Standalone` defines the `JobListFetcher` interface in order to get the
+`JobAutoScalerContext` of each job. It has a control loop that periodically calls
+`JobListFetcher#fetch` to fetch the job list and scale these jobs.
+
+Currently `FlinkClusterJobListFetcher` is the only implementation of the `JobListFetcher`
+interface, which is why `Flink Autoscaler Standalone` only supports a single Flink cluster so far.
+A `YarnJobListFetcher` is planned for the future; `Flink Autoscaler Standalone` will then call
+`YarnJobListFetcher#fetch` to periodically fetch the job list from the YARN cluster.
+
 ## Metrics
 
 The operator reports detailed jobvertex level metrics about the evaluated Flink job metrics that are collected and used in the scaling decision.
diff --git a/flink-autoscaler/README.md b/flink-autoscaler/README.md
index c15ee1f0..9e28870a 100644
--- a/flink-autoscaler/README.md
+++ b/flink-autoscaler/README.md
@@ -27,7 +27,7 @@ designed to be as generic as possible. The following are the introduction of the
 generic interfaces:
 
 - **AutoScalerEventHandler** : Handling autoscaler events, such as: ScalingReport,
-  AutoscalerError, etc. `LoggableEventHandler` is the default implementation, it logs events.
+  AutoscalerError, etc. `LoggingEventHandler` is the default implementation, it logs events.
 - **AutoScalerStateStore** : Storing all state during scaling. `InMemoryAutoScalerStateStore`
   is the default implementation, it's based on the Java Heap, so the state will be discarded
   after process restarts. We will implement persistent State Store in the future, such as
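
For readers of this commit, the control loop described under "Extensibility of autoscaler standalone" in the diff above (periodically call `JobListFetcher#fetch`, then scale each returned job) might look roughly like the sketch below. It is not taken from the commit: `JobContext`, `JobScaler`, `runOnce`, and the simplified `JobListFetcher` signature are hypothetical stand-ins chosen for illustration, and the real flink-autoscaler interfaces may differ. Isolating failures per job is also an assumption, not something the documentation states.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of the standalone control loop. All types here are hypothetical,
 * simplified stand-ins, not the real flink-autoscaler interfaces.
 */
public class ControlLoopSketch {

    /** Stand-in for JobAutoScalerContext; the real context carries far more detail. */
    static final class JobContext {
        private final String jobId;

        JobContext(String jobId) {
            this.jobId = jobId;
        }

        String jobId() {
            return jobId;
        }
    }

    /** Stand-in for JobListFetcher: returns one context per running job. */
    interface JobListFetcher {
        List<JobContext> fetch() throws Exception;
    }

    /** Stand-in for whatever evaluates metrics and applies scaling for one job. */
    interface JobScaler {
        void scale(JobContext context) throws Exception;
    }

    /**
     * Runs one iteration of the loop: fetch the job list, then scale each job.
     * Returns the ids of the jobs that were processed without an exception.
     */
    static List<String> runOnce(JobListFetcher fetcher, JobScaler scaler) {
        List<String> processed = new ArrayList<>();
        List<JobContext> jobs;
        try {
            jobs = fetcher.fetch();
        } catch (Exception e) {
            System.err.println("Fetching the job list failed: " + e.getMessage());
            return processed;
        }
        for (JobContext job : jobs) {
            try {
                scaler.scale(job);
                processed.add(job.jobId());
            } catch (Exception e) {
                // Assumption: one failing job should not block the remaining jobs.
                System.err.println("Scaling " + job.jobId() + " failed: " + e.getMessage());
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        JobListFetcher fetcher =
                () -> List.of(new JobContext("job-a"), new JobContext("job-b"));
        JobScaler scaler = job -> System.out.println("Evaluating " + job.jobId());
        // A long-running process would repeat this call on a fixed interval.
        System.out.println("Processed: " + runOnce(fetcher, scaler));
    }
}
```

Under this reading, supporting another environment would only require a different fetcher implementation, which matches the diff's remark that a YARN-specific fetcher is planned.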
