Alexandre Vermeerbergen created STORM-2851:
----------------------------------------------
Summary:
org.apache.storm.kafka.spout.KafkaSpout.doSeekRetriableTopicPartitions
sometimes throws ConcurrentModificationException
Key: STORM-2851
URL: https://issues.apache.org/jira/browse/STORM-2851
Project: Apache Storm
Issue Type: Bug
Components: storm-kafka-client
Affects Versions: 1.2.0
Environment: Using Storm 1.2.0 preview binaries shared by Stig Rohde
Døssing & Jungtaek Lim through the "[Discuss] Release Storm 1.2.0" discussion
on the Storm developers' mailing list.
With one Nimbus VM, 6 Supervisor VMs, 3 ZooKeeper VMs, and 15 topologies, talking
to a set of 5 Kafka broker VMs (based on Kafka 0.10.2), all with Oracle Server
JRE 8 update 152.
About 15 topologies, handling around 1 million Kafka messages per minute, and
connected to Redis, OpenTSDB & HBase.
Reporter: Alexandre Vermeerbergen
Hello,
We have been running Storm 1.2.0 preview on our pre-production supervision
system.
We noticed that the logs of the topology that persists logs to HBase contain
the following exception (about 4 times in a 48-hour period):
java.util.ConcurrentModificationException
    at java.util.HashMap$HashIterator.nextNode(HashMap.java:1442)
    at java.util.HashMap$KeyIterator.next(HashMap.java:1466)
    at org.apache.storm.kafka.spout.KafkaSpout.doSeekRetriableTopicPartitions(KafkaSpout.java:347)
    at org.apache.storm.kafka.spout.KafkaSpout.pollKafkaBroker(KafkaSpout.java:320)
    at org.apache.storm.kafka.spout.KafkaSpout.nextTuple(KafkaSpout.java:245)
    at org.apache.storm.daemon.executor$fn__4963$fn__4978$fn__5009.invoke(executor.clj:647)
    at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:484)
    at clojure.lang.AFn.run(AFn.java:22)
    at java.lang.Thread.run(Thread.java:748)
It looks like there's something to fix here, such as making the map
thread-safe, or ensuring at the caller level that the map is never modified
while it is being iterated.
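
For illustration only: the stack trace shows a HashMap key iterator failing,
which is the pattern produced when a map is structurally modified while its key
set is being iterated. The sketch below reproduces that pattern in isolation
and shows one common remedy, iterating over a snapshot copy of the keys. The
class CmeSketch, its methods, and its map contents are hypothetical and are not
taken from the KafkaSpout source; the actual cause in Storm may differ.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class CmeSketch {
    // Hypothetical per-partition state; NOT the actual KafkaSpout fields.
    static Map<String, Long> newState() {
        Map<String, Long> state = new HashMap<>();
        state.put("topic-0", 10L);
        state.put("topic-1", 20L);
        state.put("topic-2", 30L);
        return state;
    }

    // Removes entries while iterating the live key set.
    // Returns true if a ConcurrentModificationException was thrown.
    static boolean unsafeIteration(Map<String, Long> state) {
        try {
            for (String key : state.keySet()) {
                state.remove(key); // structural modification during iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates over a copy of the key set, so removals from the map
    // do not invalidate the iterator. Returns the number of keys visited.
    static int snapshotIteration(Map<String, Long> state) {
        int visited = 0;
        for (String key : new ArrayList<>(state.keySet())) {
            state.remove(key);
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println("unsafe iteration threw: " + unsafeIteration(newState()));
        System.out.println("snapshot iteration visited: " + snapshotIteration(newState()));
    }
}
```

A snapshot only covers modification from the iterating thread itself. If a
second thread can modify the map, it would also need external synchronization
or a concurrent collection such as java.util.concurrent.ConcurrentHashMap,
whose iterators are weakly consistent and do not throw this exception.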
Note: this topology uses the Storm Kafka Client spout with default properties
(unlike our other topologies, which are based on autocommit). However, it's the
one which deals with the highest rate of messages (lines of logs coming from
about 10000 VMs, a nice scale test for Storm :))
Could it be fixed in Storm 1.2.0 final version?
Best regards,
Alexandre Vermeerbergen