Hi,

This is Magesh, working as an engineer at Visa Inc. I'm relatively new to the
Kafka ecosystem. We are using Kafka 0.9, and during testing in our test
environments we have noticed that the producer retries sends with
NETWORK_EXCEPTION.

To debug the issue, I enabled TRACE logging and noticed that the nodes were
being added to the disconnected list, which is why the sends were retried.

From the producer code, I noticed that the following appears to be the only
place where a node is marked as disconnected:

                    /* cancel any defunct sockets */
                    if (!key.isValid()) {
                        close(channel);
                        this.disconnected.add(channel.id());
                    }
                } catch (Exception e) {
                    String desc = channel.socketDescription();
                    if (e instanceof IOException)
                        log.debug("Connection with {} disconnected", desc, e);
                    else
                        log.warn("Unexpected error from {}; closing connection", desc, e);
                    close(channel);
                    this.disconnected.add(channel.id());
                }

On careful analysis, I didn't find any log messages from the exception
block. So the only remaining possibility is that the sockets were becoming
defunct (the !key.isValid() branch). With netstat, I found that the sockets
were getting dropped periodically. I am not sure whether it is the producer,
the broker, or the network layer that is causing this. I just wanted to check
whether there is any recommendation for this. We are using SASL.
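
For reference, here is a minimal sketch of what an invalid key means at the
plain java.nio level. It is not Kafka code: the class name DefunctKeyDemo and
the method validityBeforeAndAfterClose are hypothetical, made up for
illustration. It relies only on the documented behaviour that a SelectionKey
stays valid until it is cancelled, its channel is closed, or its selector is
closed. It covers only the case where the channel is closed locally, so it
says nothing about which side is dropping the connections here.

```java
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class, not part of Kafka.
public class DefunctKeyDemo {

    // Returns {valid before close, valid after close} for a registered key.
    static boolean[] validityBeforeAndAfterClose() throws Exception {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            // Listen on an ephemeral loopback port so the client can connect.
            server.bind(new InetSocketAddress("127.0.0.1", 0));

            // Blocking connect, then switch to non-blocking for registration.
            SocketChannel client = SocketChannel.open(server.getLocalAddress());
            client.configureBlocking(false);
            SelectionKey key = client.register(selector, SelectionKey.OP_READ);

            boolean before = key.isValid();
            client.close();               // closing the channel cancels its keys
            boolean after = key.isValid();
            return new boolean[] {before, after};
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] r = validityBeforeAndAfterClose();
        System.out.println("valid before close: " + r[0]);
        System.out.println("valid after close: " + r[1]);
    }
}
```

In this sketch the key is valid right after registration and invalid once the
channel has been closed, which is the state the !key.isValid() check in the
excerpt above tests for.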

Thanks
Magesh
