[ https://issues.apache.org/jira/browse/KAFKA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15223872#comment-15223872 ]
ASF GitHub Bot commented on KAFKA-3488:
---------------------------------------

GitHub user rajinisivaram opened a pull request:

    https://github.com/apache/kafka/pull/1183

    KAFKA-3488: Avoid failing of unsent requests in consumer where possible

    Fail unsent requests only when returning from KafkaConsumer.poll().

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rajinisivaram/kafka KAFKA-3488

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1183.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1183

----
commit dfad7b0215573800bed56abd3bcc2cf7f6134513
Author: Rajini Sivaram <rajinisiva...@googlemail.com>
Date:   2016-04-04T08:38:27Z

    KAFKA-3488: Avoid failing of unsent requests in consumer where possible

----


> commitAsync() fails if metadata update creates new SASL/SSL connection
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-3488
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3488
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.9.0.1
>            Reporter: Rajini Sivaram
>            Assignee: Rajini Sivaram
>             Fix For: 0.10.0.0
>
>
> Sasl/SslConsumerTest.testSimpleConsumption() fails intermittently with a
> failure in {{commitAsync()}}. The exception stack trace shows:
> {quote}
> kafka.api.SaslPlaintextConsumerTest.testSimpleConsumption FAILED
>     java.lang.AssertionError: expected:<1> but was:<0>
>         at org.junit.Assert.fail(Assert.java:88)
>         at org.junit.Assert.failNotEquals(Assert.java:834)
>         at org.junit.Assert.assertEquals(Assert.java:645)
>         at org.junit.Assert.assertEquals(Assert.java:631)
>         at kafka.api.BaseConsumerTest.awaitCommitCallback(BaseConsumerTest.scala:340)
>         at kafka.api.BaseConsumerTest.testSimpleConsumption(BaseConsumerTest.scala:85)
> {quote}
> I have recreated this with some additional trace. The tests run with a very
> small metadata expiry interval, triggering metadata updates quite often. If a
> metadata request immediately following a {{commitAsync()}} call creates a new
> SSL/SASL connection, {{ConsumerNetworkClient.poll}} returns to process the
> connection handshake packets. Since {{ConsumerNetworkClient.poll}} discards
> all unsent packets before returning from poll, this can result in the failure
> of the commit - the callback is invoked with {{SendFailedException}}.
> I understand that {{ConsumerNetworkClient.poll()}} discards unsent packets
> rather than buffer them to keep the code simple. And perhaps it is ok to fail
> {{commitAsync}} occasionally since the callback does indicate that the caller
> should retry. But it feels like an unnecessary limitation that requires error
> handling in client applications when there are no real failures and makes it
> much harder to test reliably. As special handling to fix issues like
> KAFKA-3412, KAFKA-2672 adds more complexity to the code anyway, and because
> it is much harder to debug failures that affect only SSL/SASL, it may be
> worth considering improving this behaviour.
> I will see if I can submit a PR for the specific issue I was seeing with the
> impact of handshakes on {{commitAsync()}}, but I will be interested in views
> on improving the logic in {{ConsumerNetworkClient}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
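
For readers following the issue, the interaction between a handshake and a queued commit can be illustrated with a small toy model. This is only a sketch under stated assumptions: {{ToyNetworkClient}}, its {{Callback}} interface and the {{run}} helper are hypothetical names invented for illustration, not Kafka classes, and the single-poll handshake is a simplification. It compares failing unsent requests on every poll with failing them only on the final poll before control returns to the caller, which is the idea the pull request describes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class UnsentRequestsSketch {

    /** Hypothetical stand-in for a commit callback; not the Kafka interface. */
    interface Callback {
        void onComplete(Exception error); // error == null means success
    }

    /** Toy client: one connection that needs one poll() to finish its handshake. */
    static class ToyNetworkClient {
        private final Deque<Callback> unsent = new ArrayDeque<>();
        private boolean handshakeDone = false;

        void send(Callback callback) {
            unsent.add(callback);
        }

        /** One network pass. If failUnsent is true, anything still queued is failed on return. */
        void poll(boolean failUnsent) {
            if (!handshakeDone) {
                // This pass only processes handshake packets; nothing can be sent yet.
                handshakeDone = true;
            } else {
                // Connection is ready: everything queued is sent and completes.
                while (!unsent.isEmpty()) {
                    unsent.poll().onComplete(null);
                }
            }
            if (failUnsent) {
                while (!unsent.isEmpty()) {
                    unsent.poll().onComplete(new IllegalStateException("send failed"));
                }
            }
        }
    }

    /** Queues one commit, polls twice, and returns the outcomes seen by the callback. */
    static List<String> run(boolean failUnsentOnEveryPoll) {
        List<String> outcomes = new ArrayList<>();
        ToyNetworkClient client = new ToyNetworkClient();
        client.send(error -> outcomes.add(error == null ? "ok" : "failed"));
        client.poll(failUnsentOnEveryPoll); // handshake pass
        client.poll(true);                  // final pass before returning to the caller
        return outcomes;
    }

    public static void main(String[] args) {
        System.out.println("fail on every poll: " + run(true));
        System.out.println("fail only on final poll: " + run(false));
    }
}
```

In this model the commit fails when unsent requests are discarded after the handshake pass, and succeeds when the discard is deferred to the final pass. The real {{ConsumerNetworkClient}} is considerably more involved, so treat this as an illustration of the ordering problem only.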