Re: [jira] [Created] (FLINK-2695) KafkaITCase.testConcurrentProducerConsumerTopology failed on Travis
The tests use a ZooKeeper mini cluster and multiple Kafka MiniClusters. It appears that these are not very stable in our test setup. Let's see what we can do to improve reliability there. 1) As a first step, I would suggest to reduce the number of concurrent tests to one for this project, as it will prevent that we have multiple concurrent mini clusters competing for compute resources. 2) The method "SimpleConsumerThread.getLastOffset()" Should probably re-retrieve the leader, or we should allow the program more recovery retries... Greetings, Stephan On Wed, Sep 23, 2015 at 4:04 AM, Li, Chengxiang <chengxiang...@intel.com> wrote: > Found more KafkaITCase failure at: > https://travis-ci.org/apache/flink/jobs/81592146 > > Failed tests: > > KafkaITCase.testConcurrentProducerConsumerTopology:50->KafkaConsumerTestBase.runSimpleConcurrentProducerConsumerTopology:334->KafkaTestBase.tryExecute:313 > Test failed: The program execution failed: Job execution failed. > Tests in error: > > KafkaITCase.testCancelingEmptyTopic:57->KafkaConsumerTestBase.runCancelingOnEmptyInputTest:594 > » > > KafkaITCase.testCancelingFullTopic:62->KafkaConsumerTestBase.runCancelingOnFullInputTest:529 > » > > KafkaITCase.testMultipleSourcesOnePartition:89->KafkaConsumerTestBase.runMultipleSourcesOnePartitionExactlyOnceTest:450 > » ProgramInvocation > > KafkaITCase.testOffsetInZookeeper:45->KafkaConsumerTestBase.runOffsetInZookeeperValidationTest:205->KafkaConsumerTestBase.writeSequence:938 > » ProgramInvocation > > KafkaITCase.testOneToOneSources:79->KafkaConsumerTestBase.runOneToOneExactlyOnceTest:356 > » ProgramInvocation > > It happens only on the test mode of JDK: oraclejdk8 > PROFILE="-Dhadoop.version=2.5.0 -Dmaven.javadoc.skip=true". > > Thanks > Chengxiang > > -Original Message----- > From: Till Rohrmann (JIRA) [mailto:j...@apache.org] > Sent: Thursday, September 17, 2015 11:02 PM > To: dev@flink.apache.org > Subject: [jira] [Created] (FLINK-2695) > KafkaITCase.testConcurrentProducerConsumerTopology failed on Travis > > Till Rohrmann created FLINK-2695: > > > Summary: KafkaITCase.testConcurrentProducerConsumerTopology > failed on Travis > Key: FLINK-2695 > URL: https://issues.apache.org/jira/browse/FLINK-2695 > Project: Flink > Issue Type: Bug > Reporter: Till Rohrmann > Priority: Critical > > > The {{KafkaITCase.testConcurrentProducerConsumerTopology}} failed on > Travis with > > {code} > --- > T E S T S > --- > Running org.apache.flink.streaming.connectors.kafka.KafkaProducerITCase > Running org.apache.flink.streaming.connectors.kafka.KafkaITCase > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.296 sec > - in org.apache.flink.streaming.connectors.kafka.KafkaProducerITCase > 09/16/2015 17:19:36 Job execution switched to status RUNNING. > 09/16/2015 17:19:36 Source: Custom Source -> Sink: Unnamed(1/1) > switched to SCHEDULED > 09/16/2015 17:19:36 Source: Custom Source -> Sink: Unnamed(1/1) > switched to DEPLOYING > 09/16/2015 17:19:36 Source: Custom Source -> Sink: Unnamed(1/1) > switched to RUNNING > 09/16/2015 17:19:36 Source: Custom Source -> Sink: Unnamed(1/1) > switched to FINISHED > 09/16/2015 17:19:36 Job execution switched to status FINISHED. > 09/16/2015 17:19:36 Job execution switched to status RUNNING. > 09/16/2015 17:19:36 Source: Custom Source -> Map -> Flat Map(1/1) > switched to SCHEDULED > 09/16/2015 17:19:36 Source: Custom Source -> Map -> Flat Map(1/1) > switched to DEPLOYING > 09/16/2015 17:19:36 Source: Custom Source -> Map -> Flat Map(1/1) > switched to RUNNING > 09/16/2015 17:19:36 Source: Custom Source -> Map -> Flat Map(1/1) > switched to FAILED > java.lang.Exception: Could not forward element to next operator > at > org.apache.flink.streaming.connectors.kafka.internals.LegacyFetcher.run(LegacyFetcher.java:242) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer.run(FlinkKafkaConsumer.java:382) > at > org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:58) > at > org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:58) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:171) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581) > at java.
[jira] [Created] (FLINK-2695) KafkaITCase.testConcurrentProducerConsumerTopology failed on Travis
Till Rohrmann created FLINK-2695: Summary: KafkaITCase.testConcurrentProducerConsumerTopology failed on Travis Key: FLINK-2695 URL: https://issues.apache.org/jira/browse/FLINK-2695 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Priority: Critical The {{KafkaITCase.testConcurrentProducerConsumerTopology}} failed on Travis with {code} --- T E S T S --- Running org.apache.flink.streaming.connectors.kafka.KafkaProducerITCase Running org.apache.flink.streaming.connectors.kafka.KafkaITCase Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.296 sec - in org.apache.flink.streaming.connectors.kafka.KafkaProducerITCase 09/16/2015 17:19:36 Job execution switched to status RUNNING. 09/16/2015 17:19:36 Source: Custom Source -> Sink: Unnamed(1/1) switched to SCHEDULED 09/16/2015 17:19:36 Source: Custom Source -> Sink: Unnamed(1/1) switched to DEPLOYING 09/16/2015 17:19:36 Source: Custom Source -> Sink: Unnamed(1/1) switched to RUNNING 09/16/2015 17:19:36 Source: Custom Source -> Sink: Unnamed(1/1) switched to FINISHED 09/16/2015 17:19:36 Job execution switched to status FINISHED. 09/16/2015 17:19:36 Job execution switched to status RUNNING. 09/16/2015 17:19:36 Source: Custom Source -> Map -> Flat Map(1/1) switched to SCHEDULED 09/16/2015 17:19:36 Source: Custom Source -> Map -> Flat Map(1/1) switched to DEPLOYING 09/16/2015 17:19:36 Source: Custom Source -> Map -> Flat Map(1/1) switched to RUNNING 09/16/2015 17:19:36 Source: Custom Source -> Map -> Flat Map(1/1) switched to FAILED java.lang.Exception: Could not forward element to next operator at org.apache.flink.streaming.connectors.kafka.internals.LegacyFetcher.run(LegacyFetcher.java:242) at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer.run(FlinkKafkaConsumer.java:382) at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:58) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:58) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:171) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Could not forward element to next operator at org.apache.flink.streaming.runtime.tasks.OutputHandler$CopyingChainingOutput.collect(OutputHandler.java:332) at org.apache.flink.streaming.runtime.tasks.OutputHandler$CopyingChainingOutput.collect(OutputHandler.java:316) at org.apache.flink.streaming.runtime.io.CollectorWrapper.collect(CollectorWrapper.java:50) at org.apache.flink.streaming.runtime.io.CollectorWrapper.collect(CollectorWrapper.java:30) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$SourceOutput.collect(SourceStreamTask.java:92) at org.apache.flink.streaming.api.operators.StreamSource$NonTimestampContext.collect(StreamSource.java:88) at org.apache.flink.streaming.connectors.kafka.internals.LegacyFetcher$SimpleConsumerThread.run(LegacyFetcher.java:449) Caused by: java.lang.RuntimeException: Could not forward element to next operator at org.apache.flink.streaming.runtime.tasks.OutputHandler$CopyingChainingOutput.collect(OutputHandler.java:332) at org.apache.flink.streaming.runtime.tasks.OutputHandler$CopyingChainingOutput.collect(OutputHandler.java:316) at org.apache.flink.streaming.runtime.io.CollectorWrapper.collect(CollectorWrapper.java:50) at org.apache.flink.streaming.runtime.io.CollectorWrapper.collect(CollectorWrapper.java:30) at org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:37) at org.apache.flink.streaming.runtime.tasks.OutputHandler$CopyingChainingOutput.collect(OutputHandler.java:329) ... 6 more Caused by: org.apache.flink.streaming.connectors.kafka.testutils.SuccessException at org.apache.flink.streaming.connectors.kafka.KafkaConsumerTestBase$7.flatMap(KafkaConsumerTestBase.java:896) at org.apache.flink.streaming.connectors.kafka.KafkaConsumerTestBase$7.flatMap(KafkaConsumerTestBase.java:876) at org.apache.flink.streaming.api.operators.StreamFlatMap.processElement(StreamFlatMap.java:47) at org.apache.flink.streaming.runtime.tasks.OutputHandler$CopyingChainingOutput.collect(OutputHandler.java:329) ... 11 more 09/16/2015 17:19:36 Job execution switched to status FAILING. 09/16/2015 17:19:36 Job execution switched to status FAILED. org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed. at