Github user cestella commented on the issue:

    https://github.com/apache/incubator-metron/pull/359
  
    It occurs to me that an enrichment may be taking longer than the timeout 
Storm uses while waiting for the ack.  If that is the case, I could see 
duplicated data.
    
    Imagine the following situation, with the Storm timeout `x` and an 
enrichment taking `x + 1`.  Storm would time the message out at `x` and trigger 
a replay.  The original enrichment would still finish and send the enriched 
data from the enrichment adapter to the join bolt, which would perform the join 
and emit the joined message.  The replay would then produce another copy of the 
same message.
    
    In this case, I'd suggest either capping your enrichment's running time 
below 
http://storm.apache.org/releases/current/javadocs/org/apache/storm/Config.html#TOPOLOGY_MESSAGE_TIMEOUT_SECS
 or raising the message timeout in Storm so that it is higher than the longest 
time your enrichment can take.
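
    As a rough illustration of the first option, here is a minimal sketch of 
capping a lookup with a plain `java.util.concurrent` timeout.  The class name 
`CappedEnrichment`, the helper `callWithCap`, the fallback value, and the 
numbers in `main` are all hypothetical and not part of Metron or Storm; the 
only assumption carried over from above is that the cap should sit below the 
topology message timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CappedEnrichment {

    // Hypothetical helper: run an enrichment lookup, but give up after
    // capMillis and return a fallback instead of blocking past the cap.
    public static <T> T callWithCap(Callable<T> enrichment, long capMillis, T fallback)
            throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<T> result = pool.submit(enrichment);
            try {
                return result.get(capMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Interrupt the slow lookup so it does not finish late.
                result.cancel(true);
                return fallback;
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed value of topology.message.timeout.secs for this sketch.
        long messageTimeoutSecs = 30;
        // Keep the cap well below the message timeout (here: half of it).
        long capMillis = messageTimeoutSecs * 1000 / 2;

        String fast = callWithCap(() -> "enriched", capMillis, "unenriched");
        System.out.println(fast);

        // A lookup that would overrun a (deliberately tiny) cap.
        String slow = callWithCap(() -> {
            Thread.sleep(2000);
            return "late";
        }, 100, "unenriched");
        System.out.println(slow);
    }
}
```

    The second option needs no code in the enrichment itself: it is a 
topology configuration change to the `topology.message.timeout.secs` setting 
behind the constant linked above.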

