[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825414#comment-17825414 ]

Gyula Fora commented on FLINK-31860:
------------------------------------

I don’t really know how that would be possible but I welcome any recommendation/solution.

> FlinkDeployments never finalize when namespace is deleted
> ----------------------------------------------------------
>
>                 Key: FLINK-31860
>                 URL: https://issues.apache.org/jira/browse/FLINK-31860
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.3.1
>         Environment: Apache Flink Kubernetes Operator 1.3.1
>                      Kubernetes 1.24.9
>            Reporter: Jayme Howard
>            Assignee: Jayme Howard
>            Priority: Major
>              Labels: pull-request-available, stale-assigned
>
> This appears to be a pretty straightforward issue, but I don't know the
> codebase well enough to propose a fix. When a FlinkDeployment is present in
> a namespace, and the namespace is deleted, the FlinkDeployment never
> reconciles and fails to complete its finalizer. This leads to the namespace
> being blocked from deletion indefinitely, requiring manual manipulation to
> remove the finalizer on the FlinkDeployment.
>
> Namespace conditions:
> {code:java}
> conditions:
> - lastTransitionTime: '2023-04-18T22:17:48Z'
>   message: All resources successfully discovered
>   reason: ResourcesDiscovered
>   status: 'False'
>   type: NamespaceDeletionDiscoveryFailure
> - lastTransitionTime: '2023-03-23T18:27:37Z'
>   message: All legacy kube types successfully parsed
>   reason: ParsedGroupVersions
>   status: 'False'
>   type: NamespaceDeletionGroupVersionParsingFailure
> - lastTransitionTime: '2023-03-23T18:27:37Z'
>   message: All content successfully deleted, may be waiting on finalization
>   reason: ContentDeleted
>   status: 'False'
>   type: NamespaceDeletionContentFailure
> - lastTransitionTime: '2023-03-23T18:27:37Z'
>   message: 'Some resources are remaining: flinkdeployments.flink.apache.org
>     has 2 resource instances'
>   reason: SomeResourcesRemain
>   status: 'True'
>   type: NamespaceContentRemaining
> - lastTransitionTime: '2023-03-23T18:27:37Z'
>   message: 'Some content in the namespace has finalizers remaining:
>     flinkdeployments.flink.apache.org/finalizer in 2 resource instances'
>   reason: SomeFinalizersRemain
>   status: 'True'
>   type: NamespaceFinalizersRemaining
> phase: Terminating {code}
> FlinkDeployment example (some fields redacted):
> {code:java}
> apiVersion: flink.apache.org/v1beta1
> kind: FlinkDeployment
> metadata:
>   creationTimestamp: '2023-03-23T18:27:02Z'
>   deletionGracePeriodSeconds: 0
>   deletionTimestamp: '2023-03-23T18:27:35Z'
>   finalizers:
>   - flinkdeployments.flink.apache.org/finalizer
>   generation: 3
>   name:
>   namespace:
>   resourceVersion: '10565277081'
>   uid: e50d2683-6c0c-467e-b10c-fe0f4e404692
> spec:
>   flinkConfiguration:
>     taskmanager.numberOfTaskSlots: '2'
>   flinkVersion: v1_16
>   image:
>   job:
>     args: []
>     entryClass:
>     jarURI:
>     parallelism: 2
>     state: running
>     upgradeMode: stateless
>   jobManager:
>     replicas: 1
>     resource:
>       cpu: 1
>       memory: 2048m
>   logConfiguration:
>     log4j-console.properties: '# This affects logging for both user code and Flink
>       rootLogger.level = INFO
>       rootLogger.appenderRef.console.ref = ConsoleAppender
>       rootLogger.appenderRef.rolling.ref = RollingFileAppender
>       # Uncomment this if you want to _only_ change Flink''s logging
>       #logger.flink.name = org.apache.flink
>       #logger.flink.level = INFO
>       # The following lines keep the log level of common libraries/connectors on
>       # log level INFO. The root logger does not override this. You have to manually
>       # change the log levels here.
>       logger.akka.name = akka
>       logger.akka.level = INFO
>       logger.kafka.name= org.apache.kafka
>       logger.kafka.level = INFO
>       logger.hadoop.name = org.apache.hadoop
>       logger.hadoop.level = INFO
>       logger.zookeeper.name = org.apache.zookeeper
>       logger.zookeeper.level = INFO
>       # Log all infos to the console
>       appender.console.name = ConsoleAppender
>       appender.console.type = CONSOLE
>       appender.console.layout.type = PatternLayout
>       appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>       # Log all infos in the given rolling file
>       appender.rolling.name = RollingFileAppender
>       appender.rolling.type = RollingFile
>       appender.rolling.append = false
>       appender.rolling.fileName = ${sys:log.file}
>       appender.rolling.filePattern = ${sys:log.file}.%i
>       appender.rolling.layout.type = PatternLayout
>       appender.rolling.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p
[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825409#comment-17825409 ]

Kevin Lam commented on FLINK-31860:
-----------------------------------

Thanks [~gyfora]! Is there any consideration for putting a solution in place in the Flink Kubernetes Operator, if there will not be one from the kubernetes / josdk side?
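
The namespace conditions quoted in the issue description show which conditions keep the namespace in `Terminating`. As a minimal sketch (not part of the operator; the function name `blocking_conditions` and the abbreviated `namespace_status` dictionary are assumptions made for illustration, with values copied from the report), the blocking conditions can be picked out of the status like this:

```python
from typing import Any


def blocking_conditions(status: dict[str, Any]) -> list[str]:
    """Return the types of all conditions whose status is the string 'True'."""
    return [
        condition["type"]
        for condition in status.get("conditions", [])
        if condition.get("status") == "True"
    ]


# Abbreviated copy of the namespace status from the issue description.
namespace_status = {
    "phase": "Terminating",
    "conditions": [
        {"type": "NamespaceDeletionContentFailure", "status": "False",
         "reason": "ContentDeleted"},
        {"type": "NamespaceContentRemaining", "status": "True",
         "reason": "SomeResourcesRemain"},
        {"type": "NamespaceFinalizersRemaining", "status": "True",
         "reason": "SomeFinalizersRemain"},
    ],
}

blocked = blocking_conditions(namespace_status)
print(blocked)
```

In the reported case only `NamespaceContentRemaining` and `NamespaceFinalizersRemaining` are `'True'`, which matches the FlinkDeployment finalizer never being removed.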
[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825379#comment-17825379 ]

Gyula Fora commented on FLINK-31860:
------------------------------------

I am not aware of any solution from the kubernetes / josdk side [~klam-shop]
[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824083#comment-17824083 ]

Kevin Lam commented on FLINK-31860:
-----------------------------------

Hi, any updates on this issue? It seems that [https://github.com/kubernetes/kubernetes/issues/115070] was closed without any changes made.

> FlinkDeployments never finalize when namespace is deleted
> ----------------------------------------------------------
>
>                 Key: FLINK-31860
>                 URL: https://issues.apache.org/jira/browse/FLINK-31860
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.3.1
>         Environment: Apache Flink Kubernetes Operator 1.3.1
>                      Kubernetes 1.24.9
>            Reporter: Jayme Howard
>            Assignee: Jayme Howard
>            Priority: Blocker
>              Labels: pull-request-available, stale-assigned
[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758437#comment-17758437 ]

Clement Chevalier commented on FLINK-31860:
-------------------------------------------

Hi Folks, we're also facing this and all our dev process relies on namespace deletion so it's quite annoying for us.

I didn't have the opportunity to test the fix proposed by [~rmetzger], but it sounded like a reasonable workaround, until the [issue|https://github.com/operator-framework/java-operator-sdk/issues/1876] is fixed on java-operator-sdk/kubernetes.

[~gyfora] did you test this simple fix with cluster-wide permissions?

> FlinkDeployments never finalize when namespace is deleted
> ----------------------------------------------------------
>
>                 Key: FLINK-31860
>                 URL: https://issues.apache.org/jira/browse/FLINK-31860
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.3.1
>         Environment: Apache Flink Kubernetes Operator 1.3.1
>                      Kubernetes 1.24.9
>            Reporter: Jayme Howard
>            Assignee: Jayme Howard
>            Priority: Blocker
>              Labels: pull-request-available
[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17720981#comment-17720981 ]

Jayme Howard commented on FLINK-31860:
--------------------------------------

I *think* that only didn't work because of per-namespace permissions. If cluster-wide permissions are in place, I would expect this to still work?

> FlinkDeployments never finalize when namespace is deleted
> ----------------------------------------------------------
>
>                 Key: FLINK-31860
>                 URL: https://issues.apache.org/jira/browse/FLINK-31860
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.3.1
>         Environment: Apache Flink Kubernetes Operator 1.3.1
>                      Kubernetes 1.24.9
>            Reporter: Jayme Howard
>            Assignee: Jayme Howard
>            Priority: Blocker
>              Labels: pull-request-available
[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17720973#comment-17720973 ]

Gyula Fora commented on FLINK-31860:
------------------------------------

[~isugimpy] we can definitely add this improvement but it didn't fix it for me completely as I was getting an error from the Java Operator SDK itself as it cannot finish deletion of the CR itself (cannot remove the finalizer)

> FlinkDeployments never finalize when namespace is deleted
> ----------------------------------------------------------
>
>                 Key: FLINK-31860
>                 URL: https://issues.apache.org/jira/browse/FLINK-31860
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.3.1
>         Environment: Apache Flink Kubernetes Operator 1.3.1
>                      Kubernetes 1.24.9
>            Reporter: Jayme Howard
>            Assignee: Gyula Fora
>            Priority: Blocker
>              Labels: pull-request-available
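
The issue description mentions that the finalizer currently has to be removed by hand. A rough sketch of what that manual step amounts to, assuming a JSON merge patch is used (the helper name `finalizer_removal_patch` and the resource name `example` are hypothetical; the finalizer string is the one shown in the report). This only builds the patch body and does not talk to a cluster:

```python
import json
from typing import Any

# Finalizer name as it appears in the FlinkDeployment from the issue.
FINALIZER = "flinkdeployments.flink.apache.org/finalizer"


def finalizer_removal_patch(
    resource: dict[str, Any], finalizer: str = FINALIZER
) -> dict[str, Any]:
    """Build a JSON merge patch that drops one finalizer and keeps the others.

    A merge patch replaces lists wholesale, so the patch carries the full
    list of finalizers that should remain.
    """
    current = resource.get("metadata", {}).get("finalizers") or []
    remaining = [name for name in current if name != finalizer]
    return {"metadata": {"finalizers": remaining}}


# Hypothetical resource; name and namespace are redacted in the report.
deployment = {
    "kind": "FlinkDeployment",
    "metadata": {"name": "example", "finalizers": [FINALIZER]},
}

patch = finalizer_removal_patch(deployment)
print(json.dumps(patch))
```

The resulting body could be passed to something like `kubectl patch ... --type=merge -p '<body>'` by someone with sufficient permissions. Removing the finalizer this way skips whatever cleanup the operator would normally perform, so it is a last-resort unblock rather than a fix.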
[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17720654#comment-17720654 ] Jayme Howard commented on FLINK-31860: -- [~rmetzger] That looks awesome! Thank you for sharing it! [~gyfora] The above looks like it'd accomplish a fix for the immediate problem very easily and prevent this from impacting production clusters. If I opened a PR with that, would that be something considered acceptable as a workaround until a more proper fix could be implemented? I'd prefer to not have to maintain an internal fork of the operator when the community as a whole could benefit from this. > FlinkDeployments never finalize when namespace is deleted > - > > Key: FLINK-31860 > URL: https://issues.apache.org/jira/browse/FLINK-31860 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.3.1 > Environment: Apache Flink Kubernetes Operator 1.3.1 > Kubernetes 1.24.9 >Reporter: Jayme Howard >Assignee: Gyula Fora >Priority: Blocker > Labels: pull-request-available > > This appears to be a pretty straightforward issue, but I don't know the > codebase well enough to propose a fix. When a FlinkDeployment is present in > a namespace, and the namespace is deleted, the FlinkDeployment never > reconciles and fails to complete its finalizer. This leads to the namespace > being blocked from deletion indefinitely, requiring manual manipulation to > remove the finalizer on the FlinkDeployment. 
[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719694#comment-17719694 ]

Robert Metzger commented on FLINK-31860:
----------------------------------------

We were also facing this problem, and we've solved it for now in this hacky way:

{code}
---
 .../kubernetes/operator/utils/EventUtils.java | 24 ++++++++++++++++++------
 .../templates/rbac.yaml                       |  1 +
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/utils/EventUtils.java b/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/utils/EventUtils.java
index d993de2..36c49a4 100644
--- a/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/utils/EventUtils.java
+++ b/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/utils/EventUtils.java
@@ -22,6 +22,8 @@ import io.fabric8.kubernetes.api.model.HasMetadata;
 import io.fabric8.kubernetes.api.model.ObjectReferenceBuilder;
 import io.fabric8.kubernetes.client.KubernetesClient;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 import java.time.Instant;
 import java.util.function.Consumer;
@@ -31,6 +33,7 @@
  * https://github.com/EnMasseProject/enmasse/blob/master/k8s-api/src/main/java/io/enmasse/k8s/api/KubeEventLogger.java
  */
 public class EventUtils {
+    private static final Logger LOG = LoggerFactory.getLogger(EventUtils.class);
     public static String generateEventName(
             HasMetadata target,
@@ -58,14 +61,14 @@ public static boolean createOrUpdateEvent(
             String message,
             EventRecorder.Component component,
             Consumer eventListener) {
+        var namespace = target.getMetadata().getNamespace();
+        if (isNamespaceMarkedForDeletion(client, namespace)) {
+            LOG.info("Ignoring event because namespace is marked for deletion");
+            return true;
+        }
         var eventName = generateEventName(target, type, reason, message, component);
-        var existing =
-                client.v1()
-                        .events()
-                        .inNamespace(target.getMetadata().getNamespace())
-                        .withName(eventName)
-                        .get();
+        var existing = client.v1().events().inNamespace(namespace).withName(eventName).get();
         if (existing != null
                 && existing.getType().equals(type.name())
@@ -109,4 +112,13 @@ public static boolean createOrUpdateEvent(
             return true;
         }
     }
+
+    private static boolean isNamespaceMarkedForDeletion(KubernetesClient client, String namespace) {
+        try {
+            return client.namespaces().withName(namespace).get().isMarkedForDeletion();
+        } catch (Exception e) {
+            LOG.warn("Error while checking namespace status", e);
+            return false;
+        }
+    }
 }
diff --git a/helm/flink-kubernetes-operator/templates/rbac.yaml b/helm/flink-kubernetes-operator/templates/rbac.yaml
index f50852e..21d7071 100644
--- a/helm/flink-kubernetes-operator/templates/rbac.yaml
+++ b/helm/flink-kubernetes-operator/templates/rbac.yaml
@@ -29,6 +29,7 @@ rules:
       - events
       - configmaps
       - secrets
+      - namespaces
     verbs:
       - "*"
 {{- if .Values.rbac.nodesRule.create }}
{code}

> FlinkDeployments never finalize when namespace is deleted
> ---------------------------------------------------------
>
>                 Key: FLINK-31860
>                 URL: https://issues.apache.org/jira/browse/FLINK-31860
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.3.1
>         Environment: Apache Flink Kubernetes Operator 1.3.1
> Kubernetes 1.24.9
>            Reporter: Jayme Howard
>            Assignee: Gyula Fora
>            Priority: Blocker
>              Labels: pull-request-available
>
> This appears to be a pretty straightforward issue, but I don't know the
> codebase well enough to propose a fix. When a FlinkDeployment is present in
> a namespace, and the namespace is deleted, the FlinkDeployment never
> reconciles and fails to complete its finalizer. This leads to the namespace
> being blocked from deletion indefinitely, requiring manual manipulation to
> remove the finalizer on the FlinkDeployment.
[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17717055#comment-17717055 ]

Gyula Fora commented on FLINK-31860:
------------------------------------

I have to put this work on hold for now because any effort on the flink operator side cannot fix a bigger underlying problem. We are investigating if there is a workaround in kubernetes itself somehow.

More details here: [java-operator-sdk/java-operator-sdk#1876|https://github.com/java-operator-sdk/java-operator-sdk/issues/1876]

> FlinkDeployments never finalize when namespace is deleted
> ---------------------------------------------------------
>
>                 Key: FLINK-31860
>                 URL: https://issues.apache.org/jira/browse/FLINK-31860
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.3.1
>         Environment: Apache Flink Kubernetes Operator 1.3.1
> Kubernetes 1.24.9
>            Reporter: Jayme Howard
>            Assignee: Gyula Fora
>            Priority: Blocker
>              Labels: pull-request-available
>
> This appears to be a pretty straightforward issue, but I don't know the
> codebase well enough to propose a fix. When a FlinkDeployment is present in
> a namespace, and the namespace is deleted, the FlinkDeployment never
> reconciles and fails to complete its finalizer. This leads to the namespace
> being blocked from deletion indefinitely, requiring manual manipulation to
> remove the finalizer on the FlinkDeployment.
[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17716832#comment-17716832 ]

Jayme Howard commented on FLINK-31860:
--------------------------------------

This looks very reasonable to me as written, greatly appreciate being tagged in on it! I'll see if I can get a test of it done, but may not be able to reasonably do so until next week when I'm back to work.

> FlinkDeployments never finalize when namespace is deleted
> ---------------------------------------------------------
>
>                 Key: FLINK-31860
>                 URL: https://issues.apache.org/jira/browse/FLINK-31860
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.3.1
>         Environment: Apache Flink Kubernetes Operator 1.3.1
> Kubernetes 1.24.9
>            Reporter: Jayme Howard
>            Assignee: Gyula Fora
>            Priority: Blocker
>              Labels: pull-request-available
>
> This appears to be a pretty straightforward issue, but I don't know the
> codebase well enough to propose a fix. When a FlinkDeployment is present in
> a namespace, and the namespace is deleted, the FlinkDeployment never
> reconciles and fails to complete its finalizer. This leads to the namespace
> being blocked from deletion indefinitely, requiring manual manipulation to
> remove the finalizer on the FlinkDeployment.
[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17716805#comment-17716805 ]

Gyula Fora commented on FLINK-31860:
------------------------------------

[~isugimpy], the PR is ready, please help review/test this.

> FlinkDeployments never finalize when namespace is deleted
> ---------------------------------------------------------
>
>                 Key: FLINK-31860
>                 URL: https://issues.apache.org/jira/browse/FLINK-31860
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.3.1
>         Environment: Apache Flink Kubernetes Operator 1.3.1
> Kubernetes 1.24.9
>            Reporter: Jayme Howard
>            Assignee: Gyula Fora
>            Priority: Blocker
>              Labels: pull-request-available
>
> This appears to be a pretty straightforward issue, but I don't know the
> codebase well enough to propose a fix. When a FlinkDeployment is present in
> a namespace, and the namespace is deleted, the FlinkDeployment never
> reconciles and fails to complete its finalizer. This leads to the namespace
> being blocked from deletion indefinitely, requiring manual manipulation to
> remove the finalizer on the FlinkDeployment.
[jira] [Commented] (FLINK-31860) FlinkDeployments never finalize when namespace is deleted
[ https://issues.apache.org/jira/browse/FLINK-31860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17714393#comment-17714393 ]

Gyula Fora commented on FLINK-31860:
------------------------------------

Thank you for the detailed analysis. Event triggering seems to be the culprit here, you are completely right. We should avoid triggering events or sending status updates in the cleanup step.

We can work on the fix and I will cc you for the review so you can get some familiarity with the code :)

> FlinkDeployments never finalize when namespace is deleted
> ---------------------------------------------------------
>
>                 Key: FLINK-31860
>                 URL: https://issues.apache.org/jira/browse/FLINK-31860
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.3.1
>         Environment: Apache Flink Kubernetes Operator 1.3.1
> Kubernetes 1.24.9
>            Reporter: Jayme Howard
>            Priority: Minor
>
> This appears to be a pretty straightforward issue, but I don't know the
> codebase well enough to propose a fix. When a FlinkDeployment is present in
> a namespace, and the namespace is deleted, the FlinkDeployment never
> reconciles and fails to complete its finalizer. This leads to the namespace
> being blocked from deletion indefinitely, requiring manual manipulation to
> remove the finalizer on the FlinkDeployment.
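
The comment above from Gyula Fora suggests not triggering events during the cleanup step. A minimal sketch of that idea follows, in plain Java with no Kubernetes dependency. Everything in it is an assumption made for illustration: CleanupEventGuard, EventSink and the namespaceTerminating predicate are hypothetical names and are not part of the operator or of the fabric8 client. In a real operator the predicate would presumably be backed by a namespace lookup, and the sink by the event-creating code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch: suppress event emission while a namespace is terminating. */
public class CleanupEventGuard {

    /** Hypothetical stand-in for whatever actually creates the Kubernetes event. */
    public interface EventSink {
        void publish(String namespace, String message);
    }

    private final Predicate<String> namespaceTerminating;
    private final EventSink sink;

    public CleanupEventGuard(Predicate<String> namespaceTerminating, EventSink sink) {
        this.namespaceTerminating = namespaceTerminating;
        this.sink = sink;
    }

    /** Returns true if the event was published, false if it was suppressed. */
    public boolean emit(String namespace, String message) {
        boolean terminating;
        try {
            terminating = namespaceTerminating.test(namespace);
        } catch (RuntimeException e) {
            // Assumption: if the status check itself fails, fall back to publishing.
            terminating = false;
        }
        if (terminating) {
            // New objects cannot be created in a terminating namespace, so skip the event.
            return false;
        }
        sink.publish(namespace, message);
        return true;
    }

    public static void main(String[] args) {
        List<String> published = new ArrayList<>();
        CleanupEventGuard guard =
                new CleanupEventGuard(
                        "doomed"::equals, (ns, msg) -> published.add(ns + ": " + msg));
        guard.emit("doomed", "Cleanup"); // suppressed
        guard.emit("live", "Submit"); // published
        System.out.println(published); // prints [live: Submit]
    }
}
```

Unlike the patch quoted earlier in the thread, this sketch reports suppression to the caller instead of returning true. Which behaviour is preferable depends on how callers interpret the return value, which the thread does not settle.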