Github user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2090#discussion_r117821993
  
    --- Diff: 
storm-client/src/jvm/org/apache/storm/windowing/TimeTriggerPolicy.java ---
    @@ -62,7 +62,9 @@ public void reset() {
     
         @Override
         public void start() {
    -        executorFuture = executor.scheduleAtFixedRate(newTriggerTask(), 
duration, duration, TimeUnit.MILLISECONDS);
    +        // initial delay is slightly less than the duration so that the 
initial tuples wont't expire due to time drift
    +        long initialDelay = duration - Math.min((long) (duration * .1), 
10);
    +        executorFuture = executor.scheduleAtFixedRate(newTriggerTask(), 
initialDelay, duration, TimeUnit.MILLISECONDS);
    --- End diff --
    
    Won't this still be vulnerable to losing initial tuples if the first 
execution is delayed longer than expected for some reason?
    
    I'm wondering if we could make this more reliable by tracking the time we 
consider expired instead of/in addition to tracking what time it is now. Tuples 
between the prevExpired timestamp and the nowExpired timestamp are tuples we 
can't be sure have been processed, so they should be included in the window. 
Since the expired timestamps also move ~duration every onTrigger, we still get 
the tuples expired we expect. For the first onTrigger, we can just process all 
the tuples because we can tell it's the first call since the prevExpired 
timestamp isn't set yet.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

Reply via email to