[
https://issues.apache.org/jira/browse/KAFKA-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthias J. Sax reopened KAFKA-8677:
Reopening this ticket. Test failed again:
[https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/4226/testReport/junit/kafka.api/GroupEndToEndAuthorizationTest/testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl/]
{code:java}
org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'responses': Error reading array of size 131085, only 28 bytes available
	at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:110)
	at org.apache.kafka.common.protocol.ApiKeys.parseResponse(ApiKeys.java:313)
	at org.apache.kafka.clients.NetworkClient.parseStructMaybeUpdateThrottleTimeMetrics(NetworkClient.java:719)
	at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:833)
	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:556)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:262)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
	at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1306)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1246)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1214)
	at kafka.utils.TestUtils$.pollUntilAtLeastNumRecords(TestUtils.scala:795)
	at kafka.utils.TestUtils$.consumeRecords(TestUtils.scala:1351)
	at kafka.api.EndToEndAuthorizationTest.consumeRecords(EndToEndAuthorizationTest.scala:537)
	at kafka.api.EndToEndAuthorizationTest.consumeRecordsIgnoreOneAuthorizationException(EndToEndAuthorizationTest.scala:556)
	at kafka.api.EndToEndAuthorizationTest.testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl(EndToEndAuthorizationTest.scala:376)
{code}
> Flakey test
> GroupEndToEndAuthorizationTest#testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl
>
>
> Key: KAFKA-8677
> URL: https://issues.apache.org/jira/browse/KAFKA-8677
> Project: Kafka
> Issue Type: Bug
> Components: core, security, unit tests
> Affects Versions: 2.4.0
> Reporter: Boyang Chen
> Assignee: Guozhang Wang
> Priority: Blocker
> Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/6325/console]
>
> *18:43:39* kafka.api.GroupEndToEndAuthorizationTest > testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl STARTED
> *18:44:00* kafka.api.GroupEndToEndAuthorizationTest.testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl failed, log available in /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/core/build/reports/testOutput/kafka.api.GroupEndToEndAuthorizationTest.testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl.test.stdout
> *18:44:00* kafka.api.GroupEndToEndAuthorizationTest > testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl FAILED
> org.scalatest.exceptions.TestFailedException: Consumed 0 records before timeout instead of the expected 1 records
> ---
> I found that this flaky test actually exposes a real bug in the consumer: within
> {{KafkaConsumer.poll}}, there is an optimization that sends the next fetch
> request before returning the data, in order to pipeline the fetch requests:
> {code}
> if (!records.isEmpty()) {
>     // before returning the fetched records, we can send off the next round of fetches
>     // and avoid block waiting for their responses to enable pipelining while the user
>     // is handling the fetched records.
>     //
>     // NOTE: since the consumed position has already been updated, we must not allow
>     // wakeups or any other errors to be triggered prior to returning the fetched records.
>     if (fetcher.sendFetches() > 0 || client.hasPendingRequests()) {
>         client.pollNoWakeup();
>     }
>     return this.interceptors.onConsume(new ConsumerRecords<>(records));
> }
> {code}
> As the NOTE mentions, this pollNoWakeup must NOT throw any exceptions,
> since at this point the fetch position has already been updated. If an exception is
> thrown here, and the caller decides to catch it and continue, those records
> would never be returned again, causing data loss.
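The failure mode above can be sketched with a toy consumer (hypothetical names, not the real KafkaConsumer API): the consumed position is advanced before the buffered records are handed back, so an exception thrown in between permanently drops them when the caller swallows it and retries:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of the bug: position is advanced before records are returned,
// so an exception thrown in between loses the records for good.
class ToyConsumer {
    private final Deque<String> fetched = new ArrayDeque<>(List.of("r0", "r1"));
    long position = 0;          // mirrors the already-updated consumed position
    private boolean throwOnce = true;

    List<String> poll() {
        List<String> records = new ArrayList<>(fetched);
        fetched.clear();
        position += records.size();          // position already advanced...
        if (throwOnce && !records.isEmpty()) {
            throwOnce = false;               // simulate a one-off wakeup/error
            throw new RuntimeException("wakeup during pipelined fetch");
        }
        return records;                      // ...but never reached on the first call
    }
}

public class DataLossDemo {
    public static void main(String[] args) {
        ToyConsumer consumer = new ToyConsumer();
        List<String> received = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            try {
                received.addAll(consumer.poll());
            } catch (RuntimeException e) {
                // Caller swallows the error and retries; the two buffered
                // records are gone: position says 2, nothing was delivered.
            }
        }
        System.out.println("position=" + consumer.position
                + " received=" + received.size());
        // prints: position=2 received=0
    }
}
```

The fix direction implied by the NOTE is that no wakeup or error may surface between updating the position and returning the records.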
--
This message was sent by Atlassian Jira
(v8.3.4#803005)