Hi Ryan,
great pointer, I ended up at the same conclusion. After raising the Storm log
level to DEBUG, I can see offset problems like:
2017-05-10 15:28:36.332 o.a.s.k.s.KafkaSpout Thread-13-kafkaSpout-executor[5 5] [DEBUG]
Unexpected offset found [342356]. OffsetEntry{topic-partition=indexing-0,
fetchOffset=2988315, committedOffset=2988314,
ackedMsgs=[{topic-partition=indexing-0, offset=342356, numFails=0},
{topic-partition=indexing-0, offset=342357, numFails=0}
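The log entry makes more sense if you keep in mind that this kind of spout can only advance the committed offset across a contiguous run of acked messages. Here is a minimal sketch of that idea (my own illustration, not the actual Storm code): the acked offsets (342356, 342357) sit far below committedOffset=2988314, so nothing new can be committed and the spout flags them as unexpected.

```python
# Sketch (not the actual Storm implementation) of contiguous-commit logic:
# the spout may only commit the longest contiguous run of acked offsets
# that starts immediately after the last committed one.

def next_commit_offset(committed_offset, acked_offsets):
    """Return the highest offset that may be committed given the acked
    offsets, or the current committed offset if none qualify."""
    commit = committed_offset
    for offset in sorted(acked_offsets):
        if offset == commit + 1:
            commit = offset   # contiguous: safe to advance the commit
        elif offset <= commit:
            continue          # at or below the commit point: stale ack
        else:
            break             # gap ahead: cannot commit past it
    return commit

# With values resembling the log above, the low acked offsets cannot
# advance the committed offset at all:
print(next_commit_offset(2988314, {342356, 342357}))  # -> 2988314
```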
I have seen this before on another installation, but only when the backlog grew
into the Kafka retention window and Kafka purged older messages before they
could be consumed.
But here the retention is configured at 10 GB while only 5.4 GB reside in the
topic.
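One thing worth double-checking (a general Kafka point, not something I can confirm for this cluster): retention is enforced both by size (retention.bytes) and by age (retention.ms), so staying under the size limit does not by itself rule out purging. A toy sketch with hypothetical numbers:

```python
# Hedged sketch: Kafka deletes old log segments when either the
# size-based (retention.bytes) or time-based (retention.ms) limit is
# exceeded; a value of -1 disables that limit. Numbers are hypothetical.

def may_purge(topic_bytes, retention_bytes, oldest_age_ms, retention_ms):
    """Return which retention limits (if any) would delete old segments."""
    reasons = []
    if retention_bytes >= 0 and topic_bytes > retention_bytes:
        reasons.append("size")
    if retention_ms >= 0 and oldest_age_ms > retention_ms:
        reasons.append("time")
    return reasons

GB = 1024 ** 3
DAY_MS = 24 * 3600 * 1000
# 5.4 GB in a topic capped at 10 GB: size-based purge cannot trigger,
# but an 8-day-old segment under the default 7-day retention.ms would.
print(may_purge(int(5.4 * GB), 10 * GB, 8 * DAY_MS, 7 * DAY_MS))  # -> ['time']
```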
I have restarted the topology with offset strategy LATEST (instead of
UNCOMMITTED_EARLIEST) and no longer run into the 10k message limit. I'll do a
little more Kafka debugging and will probably switch back to
UNCOMMITTED_EARLIEST to see whether the issue is resolved now. If not, my first
guess would be that the spout cannot write the offset back and tries to start
over again; I saw a similar (non-Metron-related) problem on the Storm mailing
list a few weeks ago. If it is resolved, then it just failed once (or the topic
was broken, as you mentioned) and running with LATEST fixed it by committing a
more recent (and still available) offset.
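For anyone following along, here is a rough sketch of what the two first-poll strategies mean for the starting position, as I understand the storm-kafka-client semantics (my own illustration, not the actual implementation): UNCOMMITTED_EARLIEST resumes from the committed offset if one exists, falling back to the earliest available offset, while LATEST jumps to the log end regardless, which is why a stale or broken committed offset no longer matters.

```python
# Hedged sketch of first-poll offset strategy semantics (illustrative,
# not the actual storm-kafka-client code):
# - UNCOMMITTED_EARLIEST: resume after the committed offset if present,
#   otherwise start at the earliest available offset.
# - LATEST: always start at the log-end offset, skipping the backlog.

def start_offset(strategy, committed, earliest, latest):
    """Return the offset a spout would start polling from."""
    if strategy == "LATEST":
        return latest
    if strategy == "UNCOMMITTED_EARLIEST":
        return committed + 1 if committed is not None else earliest
    raise ValueError(f"unknown strategy: {strategy}")

# With a (possibly stale) committed offset of 2988314:
print(start_offset("UNCOMMITTED_EARLIEST", 2988314, 342000, 2998000))  # -> 2988315
print(start_offset("LATEST", 2988314, 342000, 2998000))                # -> 2998000
```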
BR,
Christian
From: Ryan Merriman <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, May 10, 2017 at 17:14
To: "[email protected]" <[email protected]>
Subject: Re: ES silently stops indexing after 10k messages
Christian,
We happened to run into this exact situation a couple days ago while tuning the
indexing topology. In our case we were testing HDFS write performance and ES
wasn't involved at all. We eventually tracked it down to a bad topic and we
suspect the offsets were messed up somehow. Creating a new topic resolved the
problem. Not sure what the root cause is at the moment but hopefully this
workaround can get you past it. Hope this helps.
Ryan
On Wed, May 10, 2017 at 8:56 AM, Christian Tramnitz
<[email protected]<mailto:[email protected]>> wrote:
Is anyone aware of a config setting that could cause the indexing topology to
stop after writing 10k messages?
I have a parser running that writes into the indexing topic. Upon restart, the
indexing topology (only ES enabled for now) picks up the latest (not the oldest
uncommitted!) messages and puts 10,000 into the current index. Then it silently
stops. No logs (ES, Kafka, Storm) give any indication of what is preventing it
from going on. When the topology is restarted, it does another round of 10k
messages and stops again. Very weird.
This is on a cluster built from Friday's master, installed on bare metal
according to the documentation.
Thanks,
Christian