[ 
https://issues.apache.org/jira/browse/FLINK-21706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300066#comment-17300066
 ] 

Dong Lin edited comment on FLINK-21706 at 3/12/21, 6:11 AM:
------------------------------------------------------------

I have investigated this bug. It is not currently reproducible: the test 
passed in all 50 runs on my laptop, and according to the Azure test history 
it has failed only once in the past month.

After looking into the Kafka source code and bug reports, I did not find any 
good explanation of or solution to this issue.

I suspect it could be due to an ephemeral network failure (e.g. the connection 
fails and a packet is truncated). Given the low risk of this bug (since it 
happens very rarely) and the difficulty of verifying/fixing it (because 
it is not easily reproducible), I suggest we close this bug as not reproducible 
if it does not happen again in the Azure test pipeline within the next week.
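
To illustrate the suspected failure mode, here is a minimal sketch of how a 
truncated response can surface as java.nio.BufferUnderflowException when a 
parser reads fixed-width fields from a ByteBuffer. The wire layout (a 4-byte 
entry count followed by one 2-byte key per entry) and the names 
TruncatedResponseDemo, encode, readKeys and failsWhenTruncated are assumptions 
made up for this example; this is not the real Kafka schema or parsing code.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

public class TruncatedResponseDemo {

    // Hypothetical layout for illustration only (NOT the real Kafka schema):
    // a 4-byte entry count, then one 2-byte key per entry.
    static ByteBuffer encode(short... keys) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 2 * keys.length);
        buf.putInt(keys.length);
        for (short key : keys) {
            buf.putShort(key);
        }
        buf.flip(); // switch from writing to reading
        return buf;
    }

    // Reads every key and returns their sum. getShort() throws
    // BufferUnderflowException when fewer than 2 bytes remain.
    static int readKeys(ByteBuffer buf) {
        int count = buf.getInt();
        int sum = 0;
        for (int i = 0; i < count; i++) {
            sum += buf.getShort();
        }
        return sum;
    }

    // Simulates a truncated packet: the count says 3 entries, but only
    // 7 of the 10 encoded bytes are visible to the reader.
    static boolean failsWhenTruncated() {
        ByteBuffer truncated = encode((short) 0, (short) 1, (short) 2).duplicate();
        truncated.limit(7);
        try {
            readKeys(truncated);
            return false;
        } catch (BufferUnderflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("complete buffer sum: "
                + readKeys(encode((short) 0, (short) 1, (short) 2)));
        System.out.println("truncated buffer underflows: " + failsWhenTruncated());
    }
}
```

The sketch only shows that a short read is sufficient to produce this 
exception type; it does not establish that this is what happened in the 
failed Azure run.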


was (Author: lindong):
I have investigated this bug. It is not currently reproducible: the test 
passed in all 50 runs on my laptop, and it has failed only once in the 
past month.

After looking into the Kafka source code and bug reports, I did not find any 
good explanation of or solution to this issue.

I suspect it could be due to an ephemeral network failure (e.g. the connection 
fails and a packet is truncated). Given the low risk of this bug (since it 
happens very rarely) and the difficulty of verifying/fixing it (because 
it is not easily reproducible), I suggest we close this bug as not reproducible 
after 1 week.

> FlinkKafkaProducerITCase.testMigrateFromAtExactlyOnceToAtLeastOnce fails with 
> "SchemaException: Error reading field 'api_keys': Error reading field 
> 'api_key': java.nio.BufferUnderflowException"
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-21706
>                 URL: https://issues.apache.org/jira/browse/FLINK-21706
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka
>    Affects Versions: 1.12.2
>            Reporter: Dawid Wysakowicz
>            Assignee: Dong Lin
>            Priority: Major
>              Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=14364&view=logs&j=72d4811f-9f0d-5fd0-014a-0bc26b72b642&t=c1d93a6a-ba91-515d-3196-2ee8019fbda7
> {code}
> [ERROR] testMigrateFromAtExactlyOnceToAtLeastOnce(org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase)  Time elapsed: 2.013 s  <<< ERROR!
> org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'api_keys': Error reading field 'api_key': java.nio.BufferUnderflowException
>       at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:110)
>       at org.apache.kafka.common.protocol.ApiKeys.parseResponse(ApiKeys.java:324)
>       at org.apache.kafka.common.protocol.ApiKeys$1.parseResponse(ApiKeys.java:162)
>       at org.apache.kafka.clients.NetworkClient.parseStructMaybeUpdateThrottleTimeMetrics(NetworkClient.java:719)
>       at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:833)
>       at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:556)
>       at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:262)
>       at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
>       at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
>       at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:161)
>       at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:484)
>       at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1267)
>       at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1235)
>       at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1168)
>       at org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironmentImpl.getAllRecordsFromTopic(KafkaTestEnvironmentImpl.java:274)
>       at org.apache.flink.streaming.connectors.kafka.KafkaTestBase.assertExactlyOnceForTopic(KafkaTestBase.java:333)
>       at org.apache.flink.streaming.connectors.kafka.KafkaTestBase.assertExactlyOnceForTopic(KafkaTestBase.java:303)
>       at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase.testMigrateFromAtExactlyOnceToAtLeastOnce(FlinkKafkaProducerITCase.java:598)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>       at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>       at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>       at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>       at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>       at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>       at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>       at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>       at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>       at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>       at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)