[ 
https://issues.apache.org/jira/browse/FLINK-31203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693865#comment-17693865
 ] 

hjw edited comment on FLINK-31203 at 2/27/23 8:03 AM:
------------------------------------------------------

[Gyula Fora|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=gyfora] 
 Could you please take a look at this problem? Thanks.

 


was (Author: JIRAUSER280998):
@[Gyula Fora|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=gyfora] 

 

> Application upgrade rollbacks failed in Flink Kubernetes Operator
> -----------------------------------------------------------------
>
>                 Key: FLINK-31203
>                 URL: https://issues.apache.org/jira/browse/FLINK-31203
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.3.1
>            Reporter: hjw
>            Priority: Major
>
> I tested the application upgrade rollback feature, but it does not work: the 
> Flink application-mode job does not roll back to the last stable spec.
> As shown in the example below, I declared an invalid pod template without a 
> container named flink-main-container to test the rollback feature.
> However, the deployment of the Flink application job simply fails with the 
> error below, and no rollback is triggered.
>  
> Error:
> org.apache.flink.client.deployment.ClusterDeploymentException: Could not create Kubernetes cluster "basic-example".
>  at org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployClusterInternal(KubernetesClusterDescriptor.java:292)
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://*/k8s/clusters/c-fwkxh/apis/apps/v1/namespaces/test-flink/deployments. Message: Deployment.apps "basic-example" is invalid: [spec.template.spec.containers[0].name: Required value, spec.template.spec.containers[0].image: Required value]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.template.spec.containers[0].name, message=Required value, reason=FieldValueRequired, additionalProperties={}), StatusCause(field=spec.template.spec.containers[0].image, message=Required value, reason=FieldValueRequired, additionalProperties={})], group=apps, kind=Deployment, name=basic-example, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Deployment.apps "basic-example" is invalid: [spec.template.spec.containers[0].name: Required value, spec.template.spec.containers[0].image: Required value], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
>  at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:673)
>  at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:612)
>  at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:560)
>  
> Env:
> Flink version: Flink 1.16
> Flink Kubernetes Operator: 1.3.1
>  
> *Last stable spec:*
> apiVersion: flink.apache.org/v1beta1
> kind: FlinkDeployment
> metadata:
>   name: basic-example
> spec:
>   image: flink:1.16
>   flinkVersion: v1_16
>   flinkConfiguration:
>     taskmanager.numberOfTaskSlots: "2"
>     kubernetes.operator.deployment.rollback.enabled: true
>     state.savepoints.dir: s3://flink-data/savepoints
>     state.checkpoints.dir: s3://flink-data/checkpoints
>     high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
>     high-availability.storageDir: s3://flink-data/ha
>   serviceAccount: flink
>   podTemplate:
>     spec:
>       containers:
>         - name: flink-main-container
>           env:
>           - name: TZ
>             value: Asia/Shanghai
>   jobManager:
>     resource:
>       memory: "2048m"
>       cpu: 1
>   taskManager:
>     resource:
>       memory: "2048m"
>       cpu: 1
>   job:
>     jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
>     parallelism: 2
>     upgradeMode: stateless
>  
> *New spec:*
> apiVersion: flink.apache.org/v1beta1
> kind: FlinkDeployment
> metadata:
>   name: basic-example
> spec:
>   image: flink:1.16
>   flinkVersion: v1_16
>   flinkConfiguration:
>     taskmanager.numberOfTaskSlots: "2"
>     kubernetes.operator.deployment.rollback.enabled: true
>     state.savepoints.dir: s3://flink-data/savepoints
>     state.checkpoints.dir: s3://flink-data/checkpoints
>     high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
>     high-availability.storageDir: s3://flink-data/ha
>   serviceAccount: flink
>   podTemplate:
>     spec:
>       containers:
>         - env:
>           - name: TZ
>             value: Asia/Shanghai
>   jobManager:
>     resource:
>       memory: "2048m"
>       cpu: 1
>   taskManager:
>     resource:
>       memory: "2048m"
>       cpu: 1
>   job:
>     jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
>     parallelism: 2
>     upgradeMode: stateless
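
For reference, the two pod templates above can be compared with a small pre-flight check. This is only a sketch: validate_pod_template is a hypothetical helper, not part of Flink or the operator, and it assumes (as the Flink pod template documentation suggests) that Flink fills in required fields such as the image only for the container named flink-main-container, so any other container must carry its own name and image. The messages it builds loosely mirror the 422 response in the error above.

```python
# Hypothetical pre-flight check; not part of Flink or the Kubernetes operator.
MAIN_CONTAINER = "flink-main-container"
REQUIRED_FIELDS = ("name", "image")


def validate_pod_template(pod_template):
    """Return a list of problems likely to make the rendered Deployment invalid."""
    containers = pod_template.get("spec", {}).get("containers", [])
    problems = []
    for index, container in enumerate(containers):
        # Assumption: Flink completes the main container itself (image, ports, ...),
        # so it is allowed to omit those fields in the template.
        if container.get("name") == MAIN_CONTAINER:
            continue
        for field in REQUIRED_FIELDS:
            if not container.get(field):
                problems.append(f"spec.containers[{index}].{field}: Required value")
    return problems


tz_env = [{"name": "TZ", "value": "Asia/Shanghai"}]

# Pod template from the last stable spec: the container is named.
stable = {"spec": {"containers": [{"name": MAIN_CONTAINER, "env": tz_env}]}}

# Pod template from the new spec: the container has no name.
broken = {"spec": {"containers": [{"env": tz_env}]}}

print(validate_pod_template(stable))  # []
print(validate_pod_template(broken))
```

The second call reports both the missing name and the missing image for containers[0], the same two fields the API server rejected.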



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
