[
https://issues.apache.org/jira/browse/FLINK-29535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gyula Fora closed FLINK-29535.
------------------------------
Resolution: Duplicate
Please reopen it if the other fix does not work.
> Flink Operator Certificate renew issue
> --------------------------------------
>
> Key: FLINK-29535
> URL: https://issues.apache.org/jira/browse/FLINK-29535
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator
> Reporter: Sebastian Struß
> Priority: Major
>
> It seems that there is an issue with the Kubernetes Operator (at least in
> version 1.1.0) when it comes to certificates for the webhook.
> We've seen this error message pop up in the logs:
> An exceptionCaught() event was fired, and it reached at the tail of the
> pipeline. It usually means the last handler in the pipeline did not handle
> the exception.
> and
> javax.net.ssl.SSLHandshakeException: Received fatal alert: bad_certificate
>   at sun.security.ssl.Alert.createSSLException(Unknown Source) ~[?:?]
>   at sun.security.ssl.Alert.createSSLException(Unknown Source) ~[?:?]
>   at sun.security.ssl.TransportContext.fatal(Unknown Source) ~[?:?]
>   at sun.security.ssl.Alert$AlertConsumer.consume(Unknown Source) ~[?:?]
>   at sun.security.ssl.TransportContext.dispatch(Unknown Source) ~[?:?]
>   at sun.security.ssl.SSLTransport.decode(Unknown Source) ~[?:?]
>   at sun.security.ssl.SSLEngineImpl.decode(Unknown Source) ~[?:?]
>   at sun.security.ssl.SSLEngineImpl.readRecord(Unknown Source) ~[?:?]
>   at sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source) ~[?:?]
>   at sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source) ~[?:?]
>   at javax.net.ssl.SSLEngine.unwrap(Unknown Source) ~[?:?]
>   at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:296) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>   at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1342) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>   at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1235) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>   at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1284) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>   at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>   at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
> It happens when our fluxcd tries to update the FlinkDeployment resource.
> This seems to trigger a webhook call to an endpoint in the operator, which
> at that point is serving an invalid certificate.
> We noticed this after the operator had been running for 18 days, so maybe
> something short-lived was not renewed correctly?
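
One way to test the "not renewed" hypothesis might be to compare the validity window of the certificate the webhook serves against the current time. The following is only a minimal sketch: the class name CertExpiryCheck, its helper methods, and the dates in main are hypothetical and are not part of the operator. How the certificate is obtained (for example from the webhook's keystore or from a TLS handshake) is left out.

```java
import java.security.cert.CertificateExpiredException;
import java.security.cert.CertificateNotYetValidException;
import java.security.cert.X509Certificate;
import java.time.Duration;
import java.time.Instant;
import java.util.Date;

// Hypothetical helper, not part of the Flink Kubernetes Operator.
public class CertExpiryCheck {

    // Whole days between now and notAfter, truncated toward zero.
    // The result is negative once the certificate expired at least a day ago.
    static long daysRemaining(Instant notAfter, Instant now) {
        return Duration.between(now, notAfter).toDays();
    }

    // True if the certificate's validity window contains the given instant.
    static boolean isValidAt(X509Certificate cert, Instant now) {
        try {
            cert.checkValidity(Date.from(now));
            return true;
        } catch (CertificateExpiredException | CertificateNotYetValidException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed example dates, chosen only for illustration.
        Instant notAfter = Instant.parse("2022-10-01T00:00:00Z");
        Instant now = Instant.parse("2022-09-13T00:00:00Z");
        System.out.println("Days remaining: " + daysRemaining(notAfter, now));
    }
}
```

If the served certificate turns out to be past its notAfter date while the operator pod is still running, that would be consistent with a renewal that was not picked up by the webhook server.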
--
This message was sent by Atlassian Jira
(v8.20.10#820010)