Github user mattf-horton commented on the issue:

    https://github.com/apache/incubator-metron/pull/481
  
    @cestella, thanks for looking at this. The primary motivation for adding 
`batchTimeout` was to keep queued tuples from being replayed because they sat 
in a batch longer than `topology.message.timeout.secs`.  Configuring the 
timeout period correctly therefore requires interacting with the Bolt.  You're 
correct that this means each Bolt that uses `BulkWriterComponent` must be 
modified to include tick tuple processing, as noted in my opening comments:
    ```
    After this patch is reviewed and accepted, similar work needs to be done 
for the ParserWriter, and possibly other sub-components. That will be in a 
separate PR.
    ```
    I implemented the changes in `BulkWriterComponent` so that it defaults to 
conservative behavior if the containing Bolt doesn't configure it.
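
    To make "conservative" concrete, here is a minimal sketch of one way such a 
fallback could be chosen.  The class name, method name, and the "half the message 
timeout" margin are assumptions for illustration only; they are not taken from 
the patch.

```java
// Hypothetical helper for illustration; not the PR's actual code or formula.
final class BatchTimeoutDefaults {
    private BatchTimeoutDefaults() {}

    /**
     * Picks a batch timeout (in seconds) that stays safely below
     * topology.message.timeout.secs. A non-positive requested value is
     * treated as "the containing Bolt did not configure a timeout".
     */
    static int effectiveBatchTimeoutSecs(int requestedSecs, int messageTimeoutSecs) {
        // Assumed safety margin: half the message timeout, but at least 1 second.
        int ceiling = Math.max(1, messageTimeoutSecs / 2);
        if (requestedSecs <= 0) {
            // Unconfigured: fall back to the conservative ceiling.
            return ceiling;
        }
        // Configured: honor the request, but never exceed the ceiling.
        return Math.min(requestedSecs, ceiling);
    }
}
```

    With Storm's default message timeout of 30 seconds, this sketch would flush an 
unconfigured writer after at most 15 seconds, and would cap an overly large 
requested value at the same limit.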
    
    I considered using a timer thread instead of tick tuples, but:
    1. This is precisely one of the use cases contemplated by the Storm team 
when they created Tick Tuples, as discussed in the [article 
here](https://hortonworks.com/blog/apache-storm-design-pattern-micro-batching/) 
cited in the jira for METRON-322.
    1. It isn't sufficient to just create a timer thread.  One must also 
monitor that thread, be able to restart it if it dies, make sure it doesn't do 
anything non-thread-safe, etc.  These add significant complexity to the code, 
and uncertainty about thread safety, since any pattern we create here will 
surely be imitated in later development, and Bolt code is not typically 
thread-safe.
    1. On the other hand, using the built-in Tick Tuples avoids both the 
complexity, since Storm handles the reliability issues internally, and the 
uncertainty, since the Tick Tuple is processed in the single flow of control of 
normal Bolt processing.
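
    The single-flow-of-control point can be sketched without any Storm classes.  
In the hypothetical `TickFlushingBatcher` below, both methods are meant to be 
called from the Bolt's `execute()` method: `onMessage` for normal tuples and 
`onTick` for tick tuples.  In a real Bolt the tick would presumably be detected 
with Storm's `TupleUtils.isTick(tuple)` and requested via 
`Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS`.  All names in the sketch are 
illustrative and are not the patch's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a batching writer; it uses no Storm or Metron classes.
final class TickFlushingBatcher {
    private final int batchSize;
    private final long batchTimeoutMillis;
    private final List<String> pending = new ArrayList<>();
    private final List<List<String>> flushed = new ArrayList<>();
    private long oldestPendingAtMillis;

    TickFlushingBatcher(int batchSize, long batchTimeoutMillis) {
        this.batchSize = batchSize;
        this.batchTimeoutMillis = batchTimeoutMillis;
    }

    /** Called for a normal tuple: queue it and flush once the batch is full. */
    void onMessage(String message, long nowMillis) {
        if (pending.isEmpty()) {
            // Remember when the oldest queued message arrived.
            oldestPendingAtMillis = nowMillis;
        }
        pending.add(message);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Called for a tick tuple: flush a partial batch that has waited too long. */
    void onTick(long nowMillis) {
        if (!pending.isEmpty()
                && nowMillis - oldestPendingAtMillis >= batchTimeoutMillis) {
            flush();
        }
    }

    // Stand-in for writing the batch and acking its tuples.
    private void flush() {
        flushed.add(new ArrayList<>(pending));
        pending.clear();
    }

    List<List<String>> flushedBatches() {
        return flushed;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

    Because both entry points run on the same thread that calls `execute()`, the 
sketch needs no locks, no timer thread, and no supervision logic.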
    
    So I think it's cleaner to use the feature provided by the Storm 
environment.  I'm open to arguments to the contrary.


