[
https://issues.apache.org/jira/browse/FLINK-36370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904020#comment-17904020
]
Aniruddh J commented on FLINK-36370:
------------------------------------
[~gyfora] could you please share some insight on this issue? Thanks!
> Flink 1.18 fails with Empty server certificate chain when High Availability
> and mTLS both enabled
> -------------------------------------------------------------------------------------------------
>
> Key: FLINK-36370
> URL: https://issues.apache.org/jira/browse/FLINK-36370
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator, Runtime / Coordination
> Affects Versions: kubernetes-operator-1.7.0, 1.18.1
> Reporter: Aniruddh J
> Priority: Major
> Attachments: Screenshot 2024-10-03 at 11.42.52 AM.png, Screenshot
> 2024-10-14 at 10.41.00 AM.png, flink-cert-issue.log,
> flink-kubernetes-operator-54b9b99bd5-hkh8q-flink-kubernetes-operator.log,
> flink-ssl-66c8dfbcc7-l725q-flink-main-container.log
>
>
> Hi, in my Kubernetes cluster I have flink-kubernetes-operator v1.7.0 and
> apache-flink v1.18.1 installed. In the FlinkDeployment CR, when I enable
> Kubernetes high availability services together with mTLS, as shown below:
> {code:java}
> high-availability.type: kubernetes
> high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
> high-availability.storageDir: 'file:///mnt/pv/ha'
> security.ssl.rest.authentication-enabled: 'true'
> {code}
> I am ending up with *SSLHandshakeException with empty client certificate*
>
> Each of them works fine when enabled on its own. After enabling
> {{-Djavax.net.debug=all}}, I observed the client-server communication and
> found that the REST client is set up in
> [https://github.com/apache/flink/blob/release-1.18/flink-runtime/src/main/java/org/apache/flink/runtime/rest/RestClient.java]
> and that it is created on the operator side in
> [https://github.com/apache/flink-kubernetes-operator/blob/b081b75b72ddde643710e869b95b214912882363/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java#L750]
> (please correct me if I am wrong).
>
> When both mTLS and HA are enabled, the client does not appear to be set up
> at all, and client creation does not follow the same code path. Below is
> the part of the SSL handshake log just before the error (the full
> handshake log is attached):
> {code:java}
> javax.net.ssl|DEBUG|53|flink-rest-server-netty-worker-thread-1|2024-09-19
> 15:16:12.508 GMT|null:-1|Produced CertificateRequest handshake message (
> "CertificateRequest":
> { "certificate types": [ecdsa_sign, rsa_sign, dss_sign] "supported signature
> algorithms": [ecdsa_secp256r1_sha256, .., rsa_sha224, dsa_sha224, ecdsa_sha1,
> rsa_pkcs1_sha1, dsa_sha1] "certificate authorities": [CN=FlinkCA, O=Apache
> Flink] }
> javax.net.ssl|DEBUG|53|flink-rest-server-netty-worker-thread-1|2024-09-19
> 15:16:12.512 GMT|null:-1|Raw read (
> 0000: 1603030007 0B 000003000000 ............
> )
> javax.net.ssl|DEBUG|53|flink-rest-server-netty-worker-thread-1|2024-09-19
> 15:16:12.513 GMT|null:-1|READ: TLSv1.2 handshake, length = 7
> javax.net.ssl|DEBUG|53|flink-rest-server-netty-worker-thread-1|2024-09-19
> 15:16:12.513 GMT|null:-1|Consuming client Certificate handshake message (
> "Certificates": <empty list>
> )
> javax.net.ssl|ERROR|53|flink-rest-server-netty-worker-thread-1|2024-09-19
> 15:16:12.514 GMT|null:-1|Fatal (BAD_CERTIFICATE): Empty server certificate
> chain (
> "throwable" : {
> javax.net.ssl.SSLHandshakeException: Empty server certificate chain
> {code}
> At first glance, it seems that when the Flink server requests a certificate
> from the client, the client sends nothing back, possibly because it has no
> certificate issued by the requested CA?
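The raw read in the handshake log above can be decoded by hand to confirm that the peer really sent a Certificate message with no certificates in it. The sketch below is only an illustration, not Flink or JDK code: the class name `TlsRecordDecoder` and its helpers are hypothetical, the hex string is copied from the "Raw read" line of the log, and the byte layout assumed is the TLS 1.2 record and handshake framing (1-byte content type, 2-byte version, 2-byte record length, 1-byte handshake type, 3-byte handshake length, 3-byte certificate_list length).

```java
class TlsRecordDecoder {
    /** Parses a hex string into bytes, ignoring any whitespace. */
    static byte[] fromHex(String hex) {
        String s = hex.replaceAll("\\s+", "");
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Big-endian unsigned readers for 1-, 2- and 3-byte fields.
    static int u8(byte[] b, int off) { return b[off] & 0xFF; }
    static int u16(byte[] b, int off) { return (u8(b, off) << 8) | u8(b, off + 1); }
    static int u24(byte[] b, int off) { return (u8(b, off) << 16) | u16(b, off + 1); }

    /**
     * Returns the certificate_list length of a TLS 1.2 Certificate handshake
     * record, or -1 if the bytes are not such a record.
     */
    static int certificateListLength(byte[] record) {
        if (record.length < 12) {
            return -1;
        }
        int contentType = u8(record, 0);   // 0x16 = handshake
        int handshakeType = u8(record, 5); // 0x0B = certificate
        if (contentType != 0x16 || handshakeType != 0x0B) {
            return -1;
        }
        return u24(record, 9);
    }

    public static void main(String[] args) {
        // Hex copied from the "Raw read" line in the log above.
        byte[] record = fromHex("1603030007 0B 000003000000");
        System.out.println("protocol version: 0x" + Integer.toHexString(u16(record, 1)));
        System.out.println("record length: " + u16(record, 3));
        System.out.println("handshake body length: " + u24(record, 6));
        System.out.println("certificate_list length: " + certificateListLength(record));
    }
}
```

For the bytes in the log, the record length decodes to 7, which matches the "READ: TLSv1.2 handshake, length = 7" line, and the certificate_list length decodes to 0, which matches the "<empty list>" reported by the JDK. This only confirms what was on the wire; it does not say which client sent it or why it had no certificate to offer.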
>
> Some client is sending a REST request to the Flink server, which the Netty
> library handles, but until we identify that client we cannot tell whether the
> problem is the client's truststore or something else that is not visible here.
>
> *Note: The certificates for Flink are self-signed certificates.*
> Thanks!
--
This message was sent by Atlassian Jira
(v8.20.10#820010)