Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/526#issuecomment-93840072
  
    After much searching and tracing through logs, with some extra logging added 
to the CoordinatedBolt, I found that the CoordinatedBolt was timing out the 
batch in a few cases. This happened whenever the batch took longer than 300ms 
of wall time to complete, because the timeout defaults to 30 seconds and 10 
seconds of simulated time equals 100ms of wall time. When this happened the 
bolts would get confused and the batch would never be fully acked. I am not 
sure why the coordinator/spout was not getting a timeout and replaying the 
batch in simulated time, but because this is a simulated-time issue that only 
really shows up on this one test, I decided to increase the timeout. If others 
think we should dig deeper and understand why the replay is not happening, I am 
happy to hand the JIRA over to them.
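
    For anyone checking the 300ms figure, here is a minimal sketch of the 
arithmetic. It only encodes the ratio quoted above (10 seconds of simulated time 
per 100ms of wall time); the class and method names are hypothetical and are not 
part of Storm.

```java
// Hypothetical helper, not Storm code: converts a timeout expressed in
// simulated seconds into the wall-clock budget the test effectively gets.
final class SimulatedTimeSketch {
    // Ratio taken from the comment: 10 s simulated == 100 ms wall time.
    static final long SIMULATED_SECS_PER_STEP = 10;
    static final long WALL_MILLIS_PER_STEP = 100;

    static long simulatedSecsToWallMillis(long simulatedSecs) {
        return simulatedSecs * WALL_MILLIS_PER_STEP / SIMULATED_SECS_PER_STEP;
    }

    public static void main(String[] args) {
        long defaultTimeoutSecs = 30; // default timeout mentioned above
        // 30 * 100 / 10 = 300
        System.out.println(simulatedSecsToWallMillis(defaultTimeoutSecs) + "ms");
    }
}
```

    At this ratio, every additional 10 seconds of timeout buys the batch 
another 100ms of wall time before the CoordinatedBolt gives up on it.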


