[
https://issues.apache.org/jira/browse/KAFKA-16305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chia-Ping Tsai reopened KAFKA-16305:
------------------------------------
reopen for backport to 3.7
> Optimisation in SslTransportLayer:handshakeUnwrap stalls TLS handshake
> ----------------------------------------------------------------------
>
> Key: KAFKA-16305
> URL: https://issues.apache.org/jira/browse/KAFKA-16305
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 3.7.0, 3.6.1
> Reporter: Gaurav Narula
> Assignee: Gaurav Narula
> Priority: Major
> Fix For: 3.8.0
>
>
> Kafka allows users to configure custom {{SSLEngine}} via the
> {{SslEngineFactory}} interface. There have been attempts to use an OpenSSL
> based {{SSLEngine}} using {{netty-handler}} over the JDK based implementation
> for performance reasons.
> While trying to use a Netty/Openssl based SSLEngine, we observe that the
> server hangs while performing the TLS handshake. We observe the following
> logs
> {code}
> 2024-02-26 01:40:00,117
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 state NOT_INITIALIZED
> 2024-02-26 01:40:00,117
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [INFO ]
> Selector - [SocketServer listenerType=ZK_BROKER, nodeId=101]
> XX-Gaurav: calling prepare channelId 127.0.0.1:60045-127.0.0.1:60046-0
> 2024-02-26 01:40:00,117
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 state NOT_INITIALIZED
> 2024-02-26 01:40:00,117
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
> 2024-02-26 01:40:00,117
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] SSLHandshake NEED_UNWRAP channelId
> 127.0.0.1:60045-127.0.0.1:60046-0, appReadBuffer pos 0, netReadBuffer pos 0,
> netWriteBuffer pos 0
> 2024-02-26 01:40:00,117
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] SSLHandshake handshakeUnwrap channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 doRead true
> 2024-02-26 01:40:00,118
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] SSLHandshake post handshakeUnwrap: channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 handshakeStatus NEED_UNWRAP status
> BUFFER_UNDERFLOW read 0
> 2024-02-26 01:40:00,118
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] SSLHandshake post unwrap NEED_UNWRAP channelId
> 127.0.0.1:60045-127.0.0.1:60046-0, handshakeResult Status = BUFFER_UNDERFLOW
> HandshakeStatus = NEED_UNWRAP bytesConsumed = 0 bytesProduced = 0,
> appReadBuffer pos 0, netReadBuffer pos 0, netWriteBuffer pos 0
> handshakeStatus NEED_UNWRAP
> 2024-02-26 01:40:00,118
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
> 2024-02-26 01:40:00,118
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
> 2024-02-26 01:40:00,118
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
> 2024-02-26 01:40:00,118
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
> 2024-02-26 01:40:00,118
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
> 2024-02-26 01:40:00,119
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
> 2024-02-26 01:40:00,119
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
> 2024-02-26 01:40:00,120
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [INFO ]
> Selector - [SocketServer listenerType=ZK_BROKER, nodeId=101]
> XX-Gaurav: calling prepare channelId 127.0.0.1:60045-127.0.0.1:60046-0
> 2024-02-26 01:40:00,120
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
> 2024-02-26 01:40:00,120
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
> 2024-02-26 01:40:00,120
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] SSLHandshake NEED_UNWRAP channelId
> 127.0.0.1:60045-127.0.0.1:60046-0, appReadBuffer pos 0, netReadBuffer pos
> 383, netWriteBuffer pos 0
> 2024-02-26 01:40:00,120
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] SSLHandshake handshakeUnwrap channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 doRead true
> 2024-02-26 01:40:00,122
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] SSLHandshake post handshakeUnwrap: channelId
> 127.0.0.1:60045-127.0.0.1:60046-0 handshakeStatus NEED_UNWRAP status OK read 0
> 2024-02-26 01:40:00,122
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE]
> SslTransportLayer - [SslTransportLayer
> channelId=127.0.0.1:60045-127.0.0.1:60046-0
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be,
> interestOps=1, readyOps=0] SSLHandshake post unwrap NEED_UNWRAP channelId
> 127.0.0.1:60045-127.0.0.1:60046-0, handshakeResult Status = OK
> HandshakeStatus = NEED_TASK bytesConsumed = 383 bytesProduced = 0,
> appReadBuffer pos 0, netReadBuffer pos 0, netWriteBuffer pos 0
> handshakeStatus NEED_UNWRAP
> {code}
> Lines 21 and 22 are of interest here as we observe that:
> 1. {{handshakeStatus}} is {{NEED_UNWRAP}} and
> 2. {{netReadBuffer.position() == 0}}
> The do-while loop in {{handshakeUnwrap()}} is
> [exited|https://github.com/apache/kafka/blob/9bc9fae9425e4dac64ef078cd3a4e7e6e09cc45a/clients/src/main/java/org/apache/kafka/common/network/SslTransportLayer.java#L529]
> because of (2) and as a result, the {{SSLEngine}}'s state machine isn't
> handled properly.
> We should therefore, avoid the optimisation and simplify the condition to be
> just {{while(cont)}}.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)