[ https://issues.apache.org/jira/browse/FLINK-21706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300066#comment-17300066 ]
Dong Lin edited comment on FLINK-21706 at 3/12/21, 6:11 AM:
------------------------------------------------------------
I have investigated this bug. It is not currently reproducible: the test
passed in all 50 runs on my laptop, and according to the Azure test history it
has failed only once in the past month.
After looking into the Kafka source code and bug reports, I did not find a
good explanation or solution for this issue.
I suspect it could be due to an ephemeral network failure (e.g. the connection
fails and a packet is truncated). Given the low risk of this bug (since it
happens very rarely) and the difficulty of verifying/fixing this issue (because
it is not easily reproducible), I suggest we close this bug as not reproducible
if it does not happen again in the Azure test pipeline within 1 week.
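To illustrate the suspected failure mode, here is a minimal sketch of how a
truncated response can surface as a java.nio.BufferUnderflowException. The
wire format below (an int32 count followed by int16 keys) and the class
TruncatedResponseDemo are hypothetical simplifications for illustration only.
This is not Kafka's actual parsing code, and it does not confirm that
truncation is the real cause of this failure.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

public class TruncatedResponseDemo {

    // Hypothetical wire format: an int32 element count, then `count` int16 keys.
    static ByteBuffer encode(short[] keys) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 2 * keys.length);
        buf.putInt(keys.length);
        for (short key : keys) {
            buf.putShort(key);
        }
        buf.flip(); // switch from writing to reading
        return buf;
    }

    // Trusts the declared count; getShort() throws BufferUnderflowException
    // as soon as fewer than 2 bytes remain in the buffer.
    static short[] decode(ByteBuffer buf) {
        int count = buf.getInt();
        short[] keys = new short[count];
        for (int i = 0; i < count; i++) {
            keys[i] = buf.getShort();
        }
        return keys;
    }

    // Simulates a response cut short on the wire by lowering the read limit.
    static ByteBuffer truncate(ByteBuffer full, int keepBytes) {
        ByteBuffer cut = full.duplicate();
        cut.limit(keepBytes);
        return cut;
    }

    static boolean underflows(ByteBuffer buf) {
        try {
            decode(buf);
            return false;
        } catch (BufferUnderflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ByteBuffer full = encode(new short[] {0, 1, 2, 3});
        System.out.println("complete response underflows: " + underflows(full.duplicate()));
        System.out.println("truncated response underflows: " + underflows(truncate(full, 7)));
    }
}
```

The point of the sketch is only that a parser which trusts a declared element
count fails in this way when the payload is shorter than announced, which
would be consistent with a failure while reading a field such as 'api_key'.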
was (Author: lindong):
I have investigated this bug. It is not currently reproducible: the test
passed in all 50 runs on my laptop, and it has failed only once in the past
month.
After looking into the Kafka source code and bug reports, I did not find a
good explanation or solution for this issue.
I suspect it could be due to an ephemeral network failure (e.g. the connection
fails and a packet is truncated). Given the low risk of this bug (since it
happens very rarely) and the difficulty of verifying/fixing this issue (because
it is not easily reproducible), I suggest we close this bug as not reproducible
after 1 week.
> FlinkKafkaProducerITCase.testMigrateFromAtExactlyOnceToAtLeastOnce fails with
> "SchemaException: Error reading field 'api_keys': Error reading field
> 'api_key': java.nio.BufferUnderflowException"
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-21706
> URL: https://issues.apache.org/jira/browse/FLINK-21706
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Kafka
> Affects Versions: 1.12.2
> Reporter: Dawid Wysakowicz
> Assignee: Dong Lin
> Priority: Major
> Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=14364&view=logs&j=72d4811f-9f0d-5fd0-014a-0bc26b72b642&t=c1d93a6a-ba91-515d-3196-2ee8019fbda7
> {code}
> [ERROR] testMigrateFromAtExactlyOnceToAtLeastOnce(org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase) Time elapsed: 2.013 s <<< ERROR!
> org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'api_keys': Error reading field 'api_key': java.nio.BufferUnderflowException
> at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:110)
> at org.apache.kafka.common.protocol.ApiKeys.parseResponse(ApiKeys.java:324)
> at org.apache.kafka.common.protocol.ApiKeys$1.parseResponse(ApiKeys.java:162)
> at org.apache.kafka.clients.NetworkClient.parseStructMaybeUpdateThrottleTimeMetrics(NetworkClient.java:719)
> at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:833)
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:556)
> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:262)
> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:161)
> at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:484)
> at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1267)
> at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1235)
> at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1168)
> at org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironmentImpl.getAllRecordsFromTopic(KafkaTestEnvironmentImpl.java:274)
> at org.apache.flink.streaming.connectors.kafka.KafkaTestBase.assertExactlyOnceForTopic(KafkaTestBase.java:333)
> at org.apache.flink.streaming.connectors.kafka.KafkaTestBase.assertExactlyOnceForTopic(KafkaTestBase.java:303)
> at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase.testMigrateFromAtExactlyOnceToAtLeastOnce(FlinkKafkaProducerITCase.java:598)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}