[
https://issues.apache.org/jira/browse/METRON-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706500#comment-15706500
]
ASF GitHub Bot commented on METRON-569:
---------------------------------------
Github user DomenicPuzio commented on the issue:
https://github.com/apache/incubator-metron/pull/359
Sorry for the delayed response, @cestella - I was taking some time off for
the holiday.
While I don't believe custom enrichment is pushing us over the timeout
(we've clocked it at 15ms at the worst), that's certainly a possibility and
something that we should also guard against. With this in mind, we now have two
tasks to look into:
1. Ensuring all messages get acked after a successful join by building out
the `streamMessageMap`
2. Making sure that our cache refresh is set to some fraction of the Storm
timeout interval
Do we want to complete both of these items? If so, should we split them
into separate JIRAs and PRs, or should I work on the acking changes within this PR?
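To make the two tasks concrete, here is a minimal, Storm-free sketch of what I have in mind. It is only an illustration: the class `JoinAckSketch`, the `onTuple` and `cacheExpiryMillis` methods, and the `acked` list (a stand-in for calling `ack` on the collector) are all hypothetical names, not the actual `JoinBolt` code. The 0.5 fraction used below is likewise just an assumption for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified, Storm-free model of a join that acks every partial message.
// All names are hypothetical; this is not Metron's actual JoinBolt API.
class JoinAckSketch {
    // Streams that must all report in before a message counts as joined.
    private final Set<String> requiredStreams;
    // message key -> (stream id -> id of the tuple that arrived on that stream)
    private final Map<String, Map<String, Long>> streamMessageMap = new HashMap<>();
    // Stand-in for acking through the collector: records the acked tuple ids.
    final List<Long> acked = new ArrayList<>();

    JoinAckSketch(Set<String> requiredStreams) {
        this.requiredStreams = new HashSet<>(requiredStreams);
    }

    /** Returns true when this tuple completed the join for its key. */
    boolean onTuple(String key, String streamId, long tupleId) {
        Map<String, Long> parts =
            streamMessageMap.computeIfAbsent(key, k -> new HashMap<>());
        parts.put(streamId, tupleId);
        if (!parts.keySet().containsAll(requiredStreams)) {
            return false; // join incomplete: hold the tuple, nothing acked yet
        }
        // Join complete: ack every tuple that contributed, not only the last one.
        acked.addAll(parts.values());
        streamMessageMap.remove(key);
        return true;
    }

    /** Cache expiry as a fraction of the Storm message timeout, in milliseconds. */
    static long cacheExpiryMillis(int stormTimeoutSecs, double fraction) {
        if (fraction <= 0.0 || fraction >= 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1)");
        }
        return (long) (stormTimeoutSecs * 1000L * fraction);
    }
}
```

With Storm's default `topology.message.timeout.secs` of 30, `JoinAckSketch.cacheExpiryMillis(30, 0.5)` gives 15000 ms, so a partial join would be evicted well before Storm replays the tuple. The sketch does not handle a replayed tuple arriving on a stream that has already reported; the real change would need to decide whether to ack or fail the older tuple in that case.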
> Enrichment topology duplicates messages
> ---------------------------------------
>
> Key: METRON-569
> URL: https://issues.apache.org/jira/browse/METRON-569
> Project: Metron
> Issue Type: Bug
> Reporter: Domenic Puzio
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> When running the 'enrichment' topology, I get duplicate messages being
> indexed. For example, I put 100 messages into the 'enrichment' Kafka queue
> and I get 175 messages onto the 'indexing' Kafka queue. This happens when I
> am running the 'enrichment' topology with one or more enrichment bolts.
> This is an acking issue within the JoinBolt class. When a message does not
> "complete" the join (for example, when it is the first message in a pair of
> messages to be joined), it does not get acked. This means that the message gets
> replayed through Storm, causing message duplication further downstream and
> a great deal of additional overhead. Adding the correct acking resolves this problem.
> I will add the PR for this shortly.