GitHub user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2218#discussion_r129109171
  
    --- Diff: docs/Windowing.md ---
    @@ -266,3 +266,105 @@ tuples can be received within the timeout period.
     An example topology `SlidingWindowTopology` shows how to use the APIs to 
compute a sliding window sum and a tumbling window 
     average.
     
    +## Stateful windowing
    +The default windowing implementation in storm stores the tuples in memory 
until they are processed and expired from the 
    +window. This limits the use cases to windows that
    +fit entirely in memory. Also, the source tuples cannot be acked until the 
window expires, which requires large message timeouts
    +(`topology.message.timeout.secs` should be larger than the window length + 
sliding interval). This also puts extra load on the system
    +due to the complex acking and anchoring requirements.
    + 
    +To address the above limitations and to support larger window sizes, storm 
provides stateful windowing support via `IStatefulWindowedBolt`. 
    +User bolts should typically extend `BaseStatefulWindowedBolt` for the 
windowing operations with the framework automatically 
    +managing the state of the window in the background.
    +
    +If the sources provide a monotonically increasing identifier as a part of 
the message, the framework can use this
    +to periodically checkpoint the last expired and evaluated message ids, to 
avoid duplicate window evaluations in case of 
    +failures or restarts. During recovery, the tuples with message ids lower 
than last expired id are discarded and tuples with 
    +message id between the last expired and last evaluated message ids are fed 
into the system without activating any triggers. 
    --- End diff --
    
    Nit: I'm assuming the triggers referred to here are the `TriggerPolicy` 
instances? I don't think triggers are mentioned elsewhere in this doc. Could we 
add a short note about them?
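
For illustration, the recovery rule in the quoted paragraph could be sketched 
roughly as below. This is a hypothetical reading, not Storm's implementation: 
the class `RecoverySketch`, the `Action` enum, and the `classify` method are 
made-up names, and the inclusive handling of the two boundary ids is an 
assumption, since the doc text does not say which side they fall on.

```java
// Hypothetical sketch of the recovery rule described in the diff.
// None of these names exist in Storm; they are for illustration only.
final class RecoverySketch {

    enum Action {
        DISCARD,                // already expired from the window before the failure
        REPLAY_WITHOUT_TRIGGER, // re-added to the window, but no trigger is fired
        PROCESS_NORMALLY        // not yet seen by any window evaluation
    }

    // Assumption: ids equal to lastExpired or lastEvaluated are replayed
    // without triggering. The source text leaves the boundaries unspecified.
    static Action classify(long msgId, long lastExpired, long lastEvaluated) {
        if (msgId < lastExpired) {
            return Action.DISCARD;
        }
        if (msgId <= lastEvaluated) {
            return Action.REPLAY_WITHOUT_TRIGGER;
        }
        return Action.PROCESS_NORMALLY;
    }

    public static void main(String[] args) {
        long lastExpired = 10;   // illustrative checkpointed values
        long lastEvaluated = 20;
        for (long id : new long[] {5, 10, 15, 20, 25}) {
            System.out.println(id + " -> " + classify(id, lastExpired, lastEvaluated));
        }
    }
}
```

If "activating triggers" here does mean calling into the `TriggerPolicy`, the 
middle case is the one where that call would be skipped.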


