Hello again! We occasionally see this stacktrace when running our stateful topology:
190875 [Thread-53-viewTrackerBolt-executor[31 31]] ERROR org.apache.storm.daemon.executor - java.lang.RuntimeException: Invalid prepared state for commit, preparedState null txid 1
    at org.apache.storm.state.InMemoryKeyValueState.commit(InMemoryKeyValueState.java:91) ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.storm.topology.StatefulBoltExecutor.handleCheckpoint(StatefulBoltExecutor.java:90) ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.storm.topology.CheckpointTupleForwarder.processCheckpoint(CheckpointTupleForwarder.java:133) [storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.storm.topology.CheckpointTupleForwarder.execute(CheckpointTupleForwarder.java:71) [storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.storm.daemon.executor$fn__3716$tuple_action_fn__3718.invoke(executor.clj:726) [storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.storm.daemon.executor$mk_task_receiver$fn__3637.invoke(executor.clj:463) [storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.storm.disruptor$clojure_handler$reify__6041.onEvent(disruptor.clj:40) [storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:435) [storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:414) [storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) [storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.storm.daemon.executor$fn__3716$fn__3729$fn__3780.invoke(executor.clj:841) [storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.storm.util$async_loop$fn__285.invoke(util.clj:484) [storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66-internal]

I haven't been able to pinpoint when this happens. It's semi-random and occurs in maybe 1 out of 20 runs of our 10-minute test. I just thought I would post it here in case anybody immediately sees what the problem is.

Cheers,
Alexander
