Rodolfo Kohn created KAFKA-13360:
------------------------------------
Summary: Wrong SSL messages when handshake fails
Key: KAFKA-13360
URL: https://issues.apache.org/jira/browse/KAFKA-13360
Project: Kafka
Issue Type: Bug
Components: network
Affects Versions: 2.8.0
Environment: Two VMs, one running one Kafka broker and the other one
running kafka-console-consumer.sh.
The consumer is validating the server certificate.
Both VMs run under VirtualBox on the same laptop,
using an internal LAN.
Latency is on the order of microseconds.
More details in attached PDF.
Reporter: Rodolfo Kohn
Attachments: Kafka error.pdf,
dump_192.168.56.101_192.168.56.102_32776_9093_2021_10_06_21_09_19.pcap,
ssl_kafka_error_logs_match_ssl_logs.txt,
ssl_kafka_error_logs_match_ssl_logs2.txt
When a consumer tries to connect to a Kafka broker and the SSL handshake fails,
for example because the server sends a certificate whose common name does not
match the server/domain name, Kafka sends out erroneous SSL messages before
sending the SSL alert. The error occurs in the client but can also be seen in
the server.
Given the nature of the problem, it will likely happen for most, if not all,
handshake errors.
I've debugged and analyzed the Kafka networking code in
org.apache.kafka.common.network and written a detailed description of how the
error occurs.
I am attaching the pcap file and a PDF with the detailed description of where
the error is in the networking code (SslTransportLayer, Channel, Selector).
I ran a very basic test between kafka-console-consumer and a simple
installation of one Kafka broker with TLS.
The test consisted of a Kafka broker with a certificate that didn’t match the
domain name I used to identify the server. The CA was set up correctly to avoid
related problems, such as an unknown CA alert. Thus, when the server sends the
certificate to the client, the handshake fails with alert code 46 (certificate
unknown). The goal was for my tool to detect the issue and send an event
describing a TLS handshake problem for both processes. However, I noticed that
the tool sent what I thought was the wrong event: a TLS exception event for an
unexpected message instead of a TLS alert event for certificate unknown.
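For reference, the two alerts mentioned above can be told apart from the
2-byte body of a plaintext TLS alert record. The following is only a minimal
sketch in Python: the table is a subset of the alert descriptions in RFC 5246,
describe_alert is a hypothetical helper name, and the "fatal" level in the
sample calls is an assumption, since the level is not stated in this report.

```python
# Subset of the TLS alert descriptions defined in RFC 5246, section 7.2.
ALERT_LEVELS = {1: "warning", 2: "fatal"}
ALERT_DESCRIPTIONS = {
    0: "close_notify",
    10: "unexpected_message",
    40: "handshake_failure",
    42: "bad_certificate",
    46: "certificate_unknown",
    48: "unknown_ca",
}


def describe_alert(payload: bytes) -> str:
    """Decode the 2-byte body (level, description) of a plaintext alert."""
    if len(payload) != 2:
        raise ValueError("a plaintext alert body is exactly 2 bytes")
    level, description = payload
    return "{}: {} ({})".format(
        ALERT_LEVELS.get(level, "unknown level"),
        ALERT_DESCRIPTIONS.get(description, "unknown description"),
        description,
    )


# Assumed level "fatal" (2) in both samples; only the description differs.
print(describe_alert(bytes([2, 46])))  # the alert expected in this test
print(describe_alert(bytes([2, 10])))  # the alert matching the tool's event
```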
I noticed that during the handshake, after the client receives Server Hello,
Certificate, Server Key Exchange, and Server Hello Done, it sends out the same
Client Hello it sent at the beginning and then 3 more records with all zeroes,
in two more messages. It sent a total of 16,709 bytes, including the 289 bytes
of the Client Hello record.
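To illustrate why a peer cannot interpret such data, here is a small Python
sketch that walks TLS record headers (1-byte content type, 2-byte version,
2-byte length) over a synthetic byte stream. The stream is NOT taken from the
attached pcap: the record version 0x0301 and the filler payload are
assumptions, and only the sizes (289 and 16,709 bytes) come from this report.
walk_records is a hypothetical helper name.

```python
import struct

# TLS record content types (RFC 5246, section 6.2.1).
CONTENT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
}
HEADER_LEN = 5  # 1 byte type + 2 bytes version + 2 bytes length


def walk_records(stream: bytes):
    """List (offset, type name, total record size) for each record header.

    Stops at the first header whose content type is undefined; that entry
    is reported with a size of None.
    """
    records = []
    offset = 0
    while offset + HEADER_LEN <= len(stream):
        ctype, _version, length = struct.unpack_from("!BHH", stream, offset)
        if ctype not in CONTENT_TYPES:
            records.append((offset, "invalid(type={})".format(ctype), None))
            break
        records.append((offset, CONTENT_TYPES[ctype], HEADER_LEN + length))
        offset += HEADER_LEN + length
    return records


# Synthetic stand-in: a 289-byte handshake record (assumed version 0x0301,
# placeholder payload) followed by zero bytes up to 16,709 bytes in total.
client_hello = struct.pack("!BHH", 22, 0x0301, 284) + b"\x01" * 284
stream = client_hello + bytes(16709 - len(client_hello))

print(len(stream) - len(client_hello))  # number of zero bytes after the record
print(walk_records(stream))
```

In this sketch the zero bytes start with content type 0, which is not a
defined record type. RFC 5246 requires an implementation that receives an
unexpected record type to send an unexpected_message alert, which may be why
the tool reported that event.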
This also looks like a design error in how protocol failures are handled.