[ https://issues.apache.org/jira/browse/KAFKA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466417#comment-16466417 ]
Matthias J. Sax commented on KAFKA-6875: ---------------------------------------- Did you see this error: {noformat} org.apache.kafka.streams.errors.StreamsException: stream-thread [appId-1-d0047b36-c57d-4a10-a22e-5aee4ea3d449-StreamThread-39] Failed to rebalance. at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:840) at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:788) at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:749) at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:718) Caused by: org.apache.kafka.streams.errors.ProcessorStateException: task directory [/tmp/kafka-17851027973441391293/appDir/appId-1/0_1] doesn't exist and couldn't be created at org.apache.kafka.streams.processor.internals.StateDirectory.directoryForTask(StateDirectory.java:98) at org.apache.kafka.streams.processor.internals.ProcessorStateManager.<init>(ProcessorStateManager.java:70) at org.apache.kafka.streams.processor.internals.AbstractTask.<init>(AbstractTask.java:90) at org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:161) at org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:146) at org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:429) at org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:381) at org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.createTasks(StreamThread.java:366) at org.apache.kafka.streams.processor.internals.TaskManager.addStreamTasks(TaskManager.java:148) at org.apache.kafka.streams.processor.internals.TaskManager.createTasks(TaskManager.java:107) at org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:268) at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:259) at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:367) at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316) at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:290) at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1148) at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1114) at org.apache.kafka.streams.processor.internals.ConsumerUtils.poll(ConsumerUtils.java:33) at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:831) ... 3 more {noformat} Might be an environmental issue? How often did it fail so far? > EosIntegrationTest#shouldNotViolateEosIfOneTaskFails is flaky > ------------------------------------------------------------- > > Key: KAFKA-6875 > URL: https://issues.apache.org/jira/browse/KAFKA-6875 > Project: Kafka > Issue Type: Test > Components: streams, unit tests > Reporter: Ted Yu > Priority: Minor > Labels: newbie++ > Attachments: EosIntegrationTest.out > > > From > https://builds.apache.org/job/kafka-trunk-jdk10/81/testReport/junit/org.apache.kafka.streams.integration/EosIntegrationTest/shouldNotViolateEosIfOneTaskFails/ > : > {code} > java.lang.AssertionError: Condition not met within timeout 60000. SteamsTasks > did not request commit. > at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:276) > at > org.apache.kafka.streams.integration.EosIntegrationTest.shouldNotViolateEosIfOneTaskFails(EosIntegrationTest.java:339) > {code} > From test output: > {code} > [2018-05-07 19:04:18,236] ERROR [Controller id=2 epoch=3] Controller 2 epoch > 3 failed to change state for partition __transaction_state-34 from > OnlinePartition to OnlinePartition (state.change.logger:76) > kafka.common.StateChangeFailedException: Failed to elect leader for partition > __transaction_state-34 under strategy > ControlledShutdownPartitionLeaderElectionStrategy > at > kafka.controller.PartitionStateMachine.$anonfun$doElectLeaderForPartitions$9(PartitionStateMachine.scala:328) > at > scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:59) > at > scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:52) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) > at > kafka.controller.PartitionStateMachine.doElectLeaderForPartitions(PartitionStateMachine.scala:326) > at > kafka.controller.PartitionStateMachine.electLeaderForPartitions(PartitionStateMachine.scala:254) > at > kafka.controller.PartitionStateMachine.doHandleStateChanges(PartitionStateMachine.scala:175) > at > kafka.controller.PartitionStateMachine.handleStateChanges(PartitionStateMachine.scala:116) > at > kafka.controller.KafkaController$ControlledShutdown.doControlledShutdown(KafkaController.scala:1055) > at > kafka.controller.KafkaController$ControlledShutdown.$anonfun$process$1(KafkaController.scala:1031) > at scala.util.Try$.apply(Try.scala:209) > at > kafka.controller.KafkaController$ControlledShutdown.process(KafkaController.scala:1031) > at > kafka.controller.ControllerEventManager$ControllerEventThread.$anonfun$doWork$1(ControllerEventManager.scala:69) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)