[GitHub] [dolphinscheduler] EricGao888 commented on a diff in pull request #13367: [Feature][Deployment] Add KEDA autoscaler support for worker deployment when deployed in K8S cluster

GitBox Fri, 13 Jan 2023 20:34:24 -0800


EricGao888 commented on code in PR #13367:
URL: https://github.com/apache/dolphinscheduler/pull/13367#discussion_r1070214289



##########
deploy/kubernetes/dolphinscheduler/templates/keda-autoscaler-worker.yaml:
##########
@@ -0,0 +1,86 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+################################
+## DolphinScheduler Worker KEDA Scaler
+#################################
+{{- if and .Values.worker.keda.enabled }}
+apiVersion: keda.sh/v1alpha1
+kind: ScaledObject
+metadata:
+  name: {{ include "dolphinscheduler.fullname" . }}-worker
+  labels:
+    component: worker-horizontalpodautoscaler
+    deploymentName: {{ include "dolphinscheduler.fullname" . }}-worker
+spec:
+  scaleTargetRef:
+    kind: StatefulSet
+    name: {{ include "dolphinscheduler.fullname" . }}-worker
+  pollingInterval:  {{ .Values.worker.keda.pollingInterval }}
+  cooldownPeriod: {{ .Values.worker.keda.cooldownPeriod }}
+  minReplicaCount: {{ .Values.worker.keda.minReplicaCount }}
+  maxReplicaCount: {{ .Values.worker.keda.maxReplicaCount }}
+  {{- if .Values.worker.keda.advanced }}
+  advanced:
+    {{ toYaml .Values.worker.keda.advanced | indent 4 }}
+    {{- end }}
+  # This is just an example, you could customize the trigger rule.
+  # FYI, check TaskExecutionStatus.java for the human-readable meaning of state values below.
+  triggers:
+    {{- if .Values.postgresql.enabled }}
+    - type: postgresql
+      metadata:
+        host: {{ template "dolphinscheduler.postgresql.fullname" . }}.{{ .Release.Namespace }}.svc.cluster.local
+        port: "5432"
+        userName: {{ .Values.postgresql.postgresqlUsername }}
+        passwordFromEnv: SPRING_DATASOURCE_PASSWORD
+        dbName: {{ .Values.postgresql.postgresqlDatabase }}
+        sslmode: "disable"
+        targetQueryValue: "1"
+        query: >-
+          SELECT ceil(COUNT(*)::decimal / {{ .Values.worker.env.WORKER_EXEC_THREADS }})

Review Comment:
   Good catch. I don't have a solution at the moment either. I will add a note to the docs to warn users about this.



##########
deploy/kubernetes/dolphinscheduler/templates/keda-autoscaler-worker.yaml:
##########
@@ -0,0 +1,86 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+################################
+## DolphinScheduler Worker KEDA Scaler
+#################################
+{{- if and .Values.worker.keda.enabled }}
+apiVersion: keda.sh/v1alpha1
+kind: ScaledObject
+metadata:
+  name: {{ include "dolphinscheduler.fullname" . }}-worker
+  labels:
+    component: worker-horizontalpodautoscaler
+    deploymentName: {{ include "dolphinscheduler.fullname" . }}-worker
+spec:
+  scaleTargetRef:
+    kind: StatefulSet
+    name: {{ include "dolphinscheduler.fullname" . }}-worker
+  pollingInterval:  {{ .Values.worker.keda.pollingInterval }}
+  cooldownPeriod: {{ .Values.worker.keda.cooldownPeriod }}
+  minReplicaCount: {{ .Values.worker.keda.minReplicaCount }}
+  maxReplicaCount: {{ .Values.worker.keda.maxReplicaCount }}
+  {{- if .Values.worker.keda.advanced }}
+  advanced:
+    {{ toYaml .Values.worker.keda.advanced | indent 4 }}
+    {{- end }}
+  # This is just an example, you could customize the trigger rule.
+  # FYI, check TaskExecutionStatus.java for the human-readable meaning of state values below.
+  triggers:
+    {{- if .Values.postgresql.enabled }}
+    - type: postgresql
+      metadata:
+        host: {{ template "dolphinscheduler.postgresql.fullname" . }}.{{ .Release.Namespace }}.svc.cluster.local
+        port: "5432"
+        userName: {{ .Values.postgresql.postgresqlUsername }}
+        passwordFromEnv: SPRING_DATASOURCE_PASSWORD
+        dbName: {{ .Values.postgresql.postgresqlDatabase }}
+        sslmode: "disable"
+        targetQueryValue: "1"
+        query: >-
+          SELECT ceil(COUNT(*)::decimal / {{ .Values.worker.env.WORKER_EXEC_THREADS }})
+          FROM t_ds_task_instance
+          WHERE state IN (0, 1, 8, 12, 17)
+    {{- else if .Values.mysql.enabled }}
+    - type: mysql
+      metadata:
+        host: {{ template "dolphinscheduler.mysql.fullname" . }}.{{ .Release.Namespace }}.svc.cluster.local
+        port: "3306"
+        dbName: {{ .Values.mysql.auth.database }}
+        username: {{ .Values.mysql.auth.username }}
+        passwordFromEnv: SPRING_DATASOURCE_PASSWORD
+        queryValue: "1"
+        query: >-
+          SELECT CEIL(COUNT(*) / {{ .Values.worker.env.WORKER_EXEC_THREADS }})
+          FROM t_ds_task_instance
+          WHERE state IN (0, 1, 8, 12, 17)
+    {{- else if .Values.externalDatabase.enabled }}
+    - type: mysql

Review Comment:
   > An `externalDatabase` is not necessarily `mysql` 😅 it can be postgresql as 
well, you might want to use
   
   I will use a conditional clause here to support both `postgresql` and `mysql`, since the two rely on different KEDA scalers whose configuration differs slightly.
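   
   A minimal sketch of what such a conditional might look like, not the code that was eventually committed. It assumes the chart's values expose `externalDatabase.type`, `externalDatabase.host`, `externalDatabase.port`, `externalDatabase.username` and `externalDatabase.database`; those key names, and `sslmode: "disable"` for an external server, are assumptions for illustration. The scaler parameter names (`userName`/`targetQueryValue` for `postgresql`, `username`/`queryValue` for `mysql`) are copied from the branches already in this template.

```yaml
    # Hypothetical replacement for the `else if .Values.externalDatabase.enabled` branch.
    # All .Values.externalDatabase.* keys below are assumed, not confirmed by this thread.
    {{- else if .Values.externalDatabase.enabled }}
    {{- if eq .Values.externalDatabase.type "postgresql" }}
    - type: postgresql
      metadata:
        host: {{ .Values.externalDatabase.host }}
        port: {{ .Values.externalDatabase.port | quote }}
        userName: {{ .Values.externalDatabase.username }}
        passwordFromEnv: SPRING_DATASOURCE_PASSWORD
        dbName: {{ .Values.externalDatabase.database }}
        sslmode: "disable"  # assumption; an external server may require SSL
        targetQueryValue: "1"
        query: >-
          SELECT ceil(COUNT(*)::decimal / {{ .Values.worker.env.WORKER_EXEC_THREADS }})
          FROM t_ds_task_instance
          WHERE state IN (0, 1, 8, 12, 17)
    {{- else if eq .Values.externalDatabase.type "mysql" }}
    - type: mysql
      metadata:
        host: {{ .Values.externalDatabase.host }}
        port: {{ .Values.externalDatabase.port | quote }}
        dbName: {{ .Values.externalDatabase.database }}
        username: {{ .Values.externalDatabase.username }}
        passwordFromEnv: SPRING_DATASOURCE_PASSWORD
        queryValue: "1"
        query: >-
          SELECT CEIL(COUNT(*) / {{ .Values.worker.env.WORKER_EXEC_THREADS }})
          FROM t_ds_task_instance
          WHERE state IN (0, 1, 8, 12, 17)
    {{- end }}
    # The existing `end` of the outer postgresql/mysql/externalDatabase chain follows here.
```

   Nesting the type check inside the `externalDatabase.enabled` branch keeps the bundled `postgresql` and `mysql` branches untouched; any other `type` value would simply render no trigger.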
   



##########
deploy/kubernetes/dolphinscheduler/templates/keda-autoscaler-worker.yaml:
##########
@@ -0,0 +1,86 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+################################
+## DolphinScheduler Worker KEDA Scaler
+#################################
+{{- if and .Values.worker.keda.enabled }}
+apiVersion: keda.sh/v1alpha1
+kind: ScaledObject
+metadata:
+  name: {{ include "dolphinscheduler.fullname" . }}-worker
+  labels:
+    component: worker-horizontalpodautoscaler
+    deploymentName: {{ include "dolphinscheduler.fullname" . }}-worker
+spec:
+  scaleTargetRef:
+    kind: StatefulSet
+    name: {{ include "dolphinscheduler.fullname" . }}-worker
+  pollingInterval:  {{ .Values.worker.keda.pollingInterval }}
+  cooldownPeriod: {{ .Values.worker.keda.cooldownPeriod }}
+  minReplicaCount: {{ .Values.worker.keda.minReplicaCount }}
+  maxReplicaCount: {{ .Values.worker.keda.maxReplicaCount }}
+  {{- if .Values.worker.keda.advanced }}
+  advanced:
+    {{ toYaml .Values.worker.keda.advanced | indent 4 }}
+    {{- end }}
+  # This is just an example, you could customize the trigger rule.
+  # FYI, check TaskExecutionStatus.java for the human-readable meaning of state values below.
+  triggers:
+    {{- if .Values.postgresql.enabled }}
+    - type: postgresql
+      metadata:
+        host: {{ template "dolphinscheduler.postgresql.fullname" . }}.{{ .Release.Namespace }}.svc.cluster.local
+        port: "5432"
+        userName: {{ .Values.postgresql.postgresqlUsername }}
+        passwordFromEnv: SPRING_DATASOURCE_PASSWORD
+        dbName: {{ .Values.postgresql.postgresqlDatabase }}
+        sslmode: "disable"
+        targetQueryValue: "1"
+        query: >-
+          SELECT ceil(COUNT(*)::decimal / {{ .Values.worker.env.WORKER_EXEC_THREADS }})
+          FROM t_ds_task_instance
+          WHERE state IN (0, 1, 8, 12, 17)
+    {{- else if .Values.mysql.enabled }}
+    - type: mysql
+      metadata:
+        host: {{ template "dolphinscheduler.mysql.fullname" . }}.{{ .Release.Namespace }}.svc.cluster.local
+        port: "3306"
+        dbName: {{ .Values.mysql.auth.database }}
+        username: {{ .Values.mysql.auth.username }}
+        passwordFromEnv: SPRING_DATASOURCE_PASSWORD
+        queryValue: "1"

Review Comment:
   > This key `queryValue` is different from the one in postgresql 
`targetQueryValue`, is that expected?
   
   Yes, it is. As the source code linked below shows, the `postgresql` and `mysql` scalers use different names for the same parameter.
   
   
https://github.com/kedacore/keda/blob/c1611482226aae17b8af1a575f1a629f0c912bd1/pkg/scalers/postgresql_scaler.go#L181-L193
   
   
https://github.com/kedacore/keda/blob/c1611482226aae17b8af1a575f1a629f0c912bd1/pkg/scalers/mysql_scaler.go#L211-L223
   



##########
docs/docs/en/guide/installation/kubernetes.md:
##########
@@ -87,6 +87,40 @@ $ kubectl delete pvc -l app.kubernetes.io/instance=dolphinscheduler
 
 > **Note**: Deleting the PVC's will delete all data as well. Please be 
 > cautious before doing it.
 
+## [Experimental] Worker Autoscaling
+
+> **Warning**: Currently this is an experimental feature and may not be suitable for production!
+
+`DolphinScheduler` uses [KEDA](https://github.com/kedacore/keda) for worker autoscaling. However, `DolphinScheduler` disables
+this feature by default. To turn on worker autoscaling:
+
+Firstly, you need to create a namespace for `KEDA` and install it with `helm`:
+
+```bash
+helm repo add kedacore https://kedacore.github.io/charts
+
+helm repo update
+
+kubectl create namespace keda
+
+helm install keda kedacore/keda \
+    --namespace keda \
+    --version "v2.0.0"
+```
+
+Secondly, you need to set `worker.keda.enabled` to `true` in `values.yaml` or install the chart by:
+
+```bash
+helm install dolphinscheduler . --set worker.keda.enabled=true -n <your-namespace-to-deploy-dolphinscheduler>
+```
+
+Once autoscaling is enabled, the number of workers will scale between `minReplicaCount` and `maxReplicaCount` based on the states
+of your tasks. For example, when there are no tasks running in your `DolphinScheduler` instance, there will be no workers,
+which will significantly save resources.
+
+The worker autoscaling feature is compatible with the `postgresql` and `mysql` databases shipped with the official `DolphinScheduler` helm chart. If you
+use an external database, the worker autoscaling feature only supports external `mysql` databases.

Review Comment:
   No trouble. Will add support for `postgresql` in next commit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
