GitHub user radai-rosenblatt opened a pull request:

    https://github.com/apache/kafka/pull/4223

    when closing a socket in response to an IOException, also print the root 
issue if closing fails

    `Selector.pollSelectionKeys()` attempts to close the channel in response to 
an Exception (lets call this exception the root issue).
    
    if the root issue itself is an IOException, its printed to log at debug 
level (which is usually off for production users):
    
    ```java
     catch (Exception e) {
        String desc = channel.socketDescription();
        if (e instanceof IOException)
           log.debug("Connection with {} disconnected", desc, e);   <----- does 
not appear in real-life log
        else if (e instanceof AuthenticationException) // will be logged later 
as error by clients
           log.debug("Connection with {} disconnected due to authentication 
exception", desc, e);
        else
           log.warn("Unexpected error from {}; closing connection", desc, e);
        close(channel, true); 
    }
    ```
    
    for some cases, close itself would throw an exception. this exception is 
printed to log as a warning (`Selector.doClose()`):
    ```java
    try {
       channel.close();
    } catch (IOException e) {
       log.error("Exception closing connection to node {}:", channel.id(), e);
    }
    ```
    
    this tends to actually show up in user log, looking something like this 
(note - line numbers are from kafka 10.2.*):
    ```
    java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470)
        at 
org.apache.kafka.common.network.SslTransportLayer.flush(SslTransportLayer.java:195)
        at 
org.apache.kafka.common.network.SslTransportLayer.close(SslTransportLayer.java:163)
        at org.apache.kafka.common.utils.Utils.closeAll(Utils.java:731)
        at 
org.apache.kafka.common.network.KafkaChannel.close(KafkaChannel.java:54)
        at org.apache.kafka.common.network.Selector.doClose(Selector.java:540)
        at org.apache.kafka.common.network.Selector.close(Selector.java:531)
        at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:378)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:303)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:349)
        at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:233)
        at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:131)
        at java.lang.Thread.run(Thread.java:745)
    ```
    
    this issue spams user's logs and is not really helpful in diagnosing the 
real underlying cause, which (after turning debug logs on) turned out to be 
(for this particular case):
    ```
    javax.net.ssl.SSLHandshakeException: General SSLEngine problem
        at sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1409)
        at 
sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:535)
        at 
sun.security.ssl.SSLEngineImpl.writeAppRecord(SSLEngineImpl.java:1214)
        at sun.security.ssl.SSLEngineImpl.wrap(SSLEngineImpl.java:1186)
        at javax.net.ssl.SSLEngine.wrap(SSLEngine.java:469)
        at 
org.apache.kafka.common.network.SslTransportLayer.handshakeWrap(SslTransportLayer.java:382)
        at 
org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:243)
        at 
org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:69)
        at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:350)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:303)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:349)
        at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:233)
        at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:131)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
        at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
        at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
        at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:304)
        at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296)
        at 
sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1478)
        at 
sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:212)
        at sun.security.ssl.Handshaker.processLoop(Handshaker.java:957)
        at sun.security.ssl.Handshaker$1.run(Handshaker.java:897)
        at sun.security.ssl.Handshaker$1.run(Handshaker.java:894)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1347)
        at 
org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:336)
        at 
org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:417)
        at 
org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:270)
        ... 7 more
    Caused by: sun.security.validator.ValidatorException: No trusted 
certificate found
        at 
sun.security.validator.SimpleValidator.buildTrustedChain(SimpleValidator.java:384)
        at 
sun.security.validator.SimpleValidator.engineValidate(SimpleValidator.java:133)
        at sun.security.validator.Validator.validate(Validator.java:260)
        at 
sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324)
        at 
sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:281)
        at 
sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136)
        at 
sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1465)
        ... 16 more
    ```
    this patch allows submitting the root cause exception (the one caught in 
pollSelectionKeys) to close so that in the event close encounters an exception 
the root cause is also printed alongside the close exception.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/radai-rosenblatt/kafka better-logging

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/4223.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4223
    
----
commit d05e4fda9171f8cf0b73d175c05dc15ed99deba4
Author: radai-rosenblatt <radai.rosenbl...@gmail.com>
Date:   2017-11-15T23:12:42Z

    when closing a socket in response to an IOException, also print the root 
issue if closing fails

----


---

Reply via email to