[
https://issues.apache.org/jira/browse/FLINK-27329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534972#comment-17534972
]
Biao Geng commented on FLINK-27329:
-----------------------------------
Hi [~wangyang0918] [~gyfora], I revisited the default values in our *Spec* classes
and summarized them in the table below.
IMO, most of them work well with a `null` default value, but besides
JobManagerSpec#replicas, there are some fields that I believe we can improve:
# *JobManagerSpec#resource#cpu &* *TaskManagerSpec#resource#cpu:* the current
default value is 0, which is not consistent with upstream Flink. In my mind,
changing the default value to 1.0 is better because: a) for the JM, if users do not
specify it explicitly, Flink will use 1.0; b) for the TM, if users do not specify it
explicitly, Flink will use NUM_TASK_SLOTS, whose default value is also 1 in
Flink's default flink-conf.yaml.
# *JobSpec#parallelism:* the current default value is 0, which is illegal. But I
am not sure if it is possible/proper for us to read the value of
parallelism.default in flink-conf.yaml in the *JobSpec* constructor. I tend to
leave it as it is or use `1` as the default.
# *FlinkDeploymentSpec#serviceAccount:* the current default value is null and, as a
result, if we do not specify it, Flink will use the `default` service account. This
can be problematic, as we use `flink` as the default in our helm chart's
values.yaml. I am not sure why we expose it as a first-class field, but
`flink` may be a good candidate default value.
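
The defaults proposed in the list above could be sketched with plain field initializers. This is only an illustration: the classes below are simplified, hypothetical stand-ins for the operator's spec classes, and the sketch assumes the YAML deserializer creates each object through its no-arg constructor and only overwrites fields that are present in the resource.

```java
public class SpecDefaultsSketch {

    // Hypothetical, simplified stand-ins for the operator's spec classes;
    // the real classes carry more fields and annotations.
    static class Resource {
        // A bare primitive `double` field would default to 0.0 in Java;
        // the initializer applies the proposed 1.0 when the field is omitted.
        double cpu = 1.0;
        // Left null so that Flink's own configuration default applies.
        String memory;
    }

    static class JobManagerSpec {
        // Proposed default instead of Java's implicit 0.
        int replicas = 1;
        Resource resource = new Resource();
    }

    static class JobSpec {
        // One of the two options discussed above; 0 is an illegal parallelism.
        int parallelism = 1;
    }

    static class FlinkDeploymentSpec {
        // Candidate default that matches the helm chart's values.yaml.
        String serviceAccount = "flink";
        JobManagerSpec jobManager = new JobManagerSpec();
        JobSpec job = new JobSpec();
    }

    public static void main(String[] args) {
        // Simulates a resource in which none of these fields were declared.
        FlinkDeploymentSpec spec = new FlinkDeploymentSpec();
        System.out.println("replicas=" + spec.jobManager.replicas);
        System.out.println("jm cpu=" + spec.jobManager.resource.cpu);
        System.out.println("parallelism=" + spec.job.parallelism);
        System.out.println("serviceAccount=" + spec.serviceAccount);
    }
}
```

Fields whose upstream default comes from flink-conf.yaml (for example memory) stay `null` in this sketch, so the operator would not override whatever Flink resolves on its own.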
| |Default Value in Upstream Flink Native K8s|Current Default Value in K8s Operator|
|FlinkDeploymentSpec#imagePullPolicy|KubernetesConfigOptions.ImagePullPolicy.IfNotPresent|null|
|FlinkDeploymentSpec#image|KubernetesConfigOptions#getDefaultFlinkImage()|null|
|*FlinkDeploymentSpec#serviceAccount*|"default"|null|
|FlinkDeploymentSpec#flinkVersion|\|null|
|FlinkDeploymentSpec#IngressSpec|\|null|
|FlinkDeploymentSpec#podTemplate|no default value|null|
|*JobManagerSpec#replicas*|1|0|
|JobManagerSpec#resource#memory|1600M(defined in flink-conf.yaml)|null|
|*JobManagerSpec#resource#cpu*|1.0|0|
|JobManagerSpec#podTemplate|no default value|null|
|TaskManagerSpec#resource#memory|1728m(defined in flink-conf.yaml)|null|
|*TaskManagerSpec#resource#cpu*|NUM_TASK_SLOTS(whose default value is 1 in flink-conf.yaml)|0|
|TaskManagerSpec#podTemplate|no default value|null|
|JobSpec#jarURI|no default value|null|
|*JobSpec#parallelism*|parallelism.default in flink-conf.yaml|0|
|JobSpec#entryClass|no default value|null|
|JobSpec#args|no default value|String[0]|
|JobSpec#state|\|JobState.RUNNING|
|JobSpec#savepointTriggerNonce|\|null|
|JobSpec#initialSavepointPath|\|null|
|JobSpec#upgradeMode|\|UpgradeMode.STATELESS|
|JobSpec#allowNonRestoredState|\|null|
> Add default value of replica of JM pod and not declare it in example yamls
> --------------------------------------------------------------------------
>
> Key: FLINK-27329
> URL: https://issues.apache.org/jira/browse/FLINK-27329
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Reporter: Biao Geng
> Assignee: Biao Geng
> Priority: Critical
> Fix For: kubernetes-operator-1.0.0
>
>
> Currently, we do not explicitly set the default value of `replica` in
> `JobManagerSpec`. As a result, Java sets the default value to be zero.
> Besides, in our examples, we explicitly declare `replica` in `JobManagerSpec`
> to be 1.
> After a deeper look while debugging the exception thrown in FLINK-27310, we
> find it would be better to set the default value to 1 for the `replica` field
> and remove the declaration in examples for the following reasons:
> 1. A normal Session or Application cluster should have at least one JM. The
> current default value, zero, does not follow the common case.
> 2. One JM can work for k8s HA mode as well, and if users really want to launch
> a standby JM for faster recovery, they can declare the value of the `replica`
> field in the yaml file. In examples, we just use the new default value (i.e.
> 1), which should be fine.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)