[
https://issues.apache.org/jira/browse/STORM-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16108952#comment-16108952
]
Guang Du commented on STORM-2666:
---------------------------------
Hi [~Srdo], thanks for reply. I checked and the STORM-2544 is included in my
code. The issue is different from STORM-2544.
STORM-2544 is missing to ack offset, causing dis-continuous offsets in
OffsetManager, as a result unable to commit the missing ones.
My observation is Spout is emitting already committed offsets again, and the
newly acked duplicated ones will be added in OffsetManager. This is confusing
Spout for the 'uncommitted count', as a result could block poll(), as well as
blocking OffsetManager to find next feasible commitable offset in
org.apache.storm.kafka.spout.internal.OffsetManager#findNextCommitOffset.
Currently I have no idea where these duplicated offsets come from, my guess is
they're from the 'on the way' failed ones, which eventually will be acked by
retry service.
I'm making a temporary fix by removing offsets before committedOffset in
org.apache.storm.kafka.spout.internal.OffsetManager#findNextCommitOffset, but I
think probably this is not the best solution, and you could have better ideas.
> Kafka Client Spout send & ack committed offsets
> -----------------------------------------------
>
> Key: STORM-2666
> URL: https://issues.apache.org/jira/browse/STORM-2666
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-kafka-client
> Affects Versions: 1.1.1
> Reporter: Guang Du
>
> Under a certain heavy load, for failed/timeout tuples, the retry service will
> ack tuple for failed max times. Kafka Client Spout will commit after reached
> the commit interval. However seems some 'on the way' tuples will be failed
> again, the retry service will cause Spout to emit again, and acked eventually
> to OffsetManager.
> In some cases such offsets are too many, exceeding the max-uncommit, causing
> org.apache.storm.kafka.spout.internal.OffsetManager#findNextCommitOffset
> unable to find next commit point, and Spout for this partition will not poll
> any more.
> By the way I've applied STORM-2549 PR#2156 from Stig Døssing to fix
> STORM-2625, and I'm using Python Shell Bolt as processing bolt, if this
> information helps.
> resulting logs like below. I'm not sure if the issue has already been
> raised/fixed, glad if anyone could help to point out existing JIRA. Thank you.
> 2017-07-27 22:23:48.398 o.a.s.k.s.KafkaSpout Thread-23-spout-executor[248
> 248] [INFO] Successful ack for tuple message
> [{topic-partition=kafka_bd_trigger_action-20, offset=18204, numFails=0}].
> 2017-07-27 22:23:49.203 o.a.s.k.s.i.OffsetManager
> Thread-23-spout-executor[248 248] [WARN] topic-partition
> [kafka_bd_trigger_action-18] has unexpected offset [16002]. Current committed
> Offset [16003]
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)