[ https://issues.apache.org/jira/browse/STORM-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124860#comment-16124860 ]
Guang Du commented on STORM-2666:
---------------------------------
Thank you [~Srdo] for your detailed explanation.
First, to answer your questions:
Yes, it's the 1.1.x branch.
No, I'm not using Kafka's topic compaction, and I didn't see the log.
Given your explanation of Storm's underlying mechanism, I agree my assumption
might not be valid. I think a better option would be to add more logging to
find the root cause. I'll try this in our environment and get back to you
with the logs.
Regarding the unexpected offset, I can't remember clearly whether it was always
off by one. I'll add logging to check again.
Your explanation of Storm's message processing was great. I'll take some time
to look a little deeper into the Clojure code to get a better understanding.
Thank you very much. :)
I'll get back to you with more logs later.
> Kafka Client Spout send & ack committed offsets
> -----------------------------------------------
>
> Key: STORM-2666
> URL: https://issues.apache.org/jira/browse/STORM-2666
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-kafka-client
> Affects Versions: 1.1.1
> Reporter: Guang Du
>
> Under heavy load, for failed or timed-out tuples, the retry service acks a
> tuple once it has failed the maximum number of times. The Kafka Client Spout
> commits once the commit interval is reached. However, it seems that some
> in-flight tuples fail again, the retry service causes the spout to emit them
> again, and they are eventually acked to the OffsetManager.
> In some cases there are too many such offsets, exceeding the max-uncommit
> limit, so that
> org.apache.storm.kafka.spout.internal.OffsetManager#findNextCommitOffset
> is unable to find the next commit point, and the spout stops polling this
> partition.
> By the way, I've applied STORM-2549 PR#2156 from Stig Døssing to fix
> STORM-2625, and I'm using a Python Shell Bolt as the processing bolt, in
> case this information helps.
> The resulting logs are shown below. I'm not sure whether this issue has
> already been raised or fixed; I'd be glad if anyone could point me to an
> existing JIRA. Thank you.
> 2017-07-27 22:23:48.398 o.a.s.k.s.KafkaSpout Thread-23-spout-executor[248 248] [INFO] Successful ack for tuple message [{topic-partition=kafka_bd_trigger_action-20, offset=18204, numFails=0}].
> 2017-07-27 22:23:49.203 o.a.s.k.s.i.OffsetManager Thread-23-spout-executor[248 248] [WARN] topic-partition [kafka_bd_trigger_action-18] has unexpected offset [16002]. Current committed Offset [16003]
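
To make the suspected behaviour easier to discuss, here is a minimal sketch of how a scan over acked offsets could get stuck on an offset that is already committed, as in the WARN line above. This is not the actual OffsetManager code. The class name CommitScanSketch, the skipStale flag, and the assumption that "committed" means the highest offset already committed are all hypothetical and only serve the illustration.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical, simplified model; NOT the real storm-kafka-client OffsetManager.
public class CommitScanSketch {

    /**
     * Scans acked offsets in ascending order and returns the highest offset
     * that forms an unbroken run starting at committed + 1, or -1 if none.
     *
     * Assumption: "committed" is the highest offset already committed.
     * skipStale is an illustrative switch: when false, the scan stops at the
     * first acked offset that is at or below the committed offset.
     */
    public static long findNextCommitOffset(long committed,
                                            SortedSet<Long> acked,
                                            boolean skipStale) {
        long next = committed;
        for (long offset : acked) {
            if (offset == next + 1) {
                next = offset;          // contiguous, extend the run
            } else if (offset > next + 1) {
                break;                  // gap: an earlier offset is still pending
            } else if (!skipStale) {
                break;                  // stale (already committed) offset blocks the scan
            }
            // otherwise: stale offset is ignored and the scan continues
        }
        return next > committed ? next : -1;
    }

    public static void main(String[] args) {
        // Offsets from the WARN line above: 16002 was acked after 16003 was committed.
        SortedSet<Long> acked = new TreeSet<>(Arrays.asList(16002L, 16004L, 16005L));

        System.out.println(findNextCommitOffset(16003L, acked, false)); // prints -1
        System.out.println(findNextCommitOffset(16003L, acked, true));  // prints 16005
    }
}
```

In this model, a late ack for an already committed offset sits at the head of the sorted set. If the scan stops there, no commit point is found even though 16004 and 16005 are ready, which would match the symptom described in the issue. Whether the real code path behaves this way is exactly what the extra logging should confirm.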