GitHub user arunmahadevan opened a pull request:

    https://github.com/apache/storm/pull/963

    [STORM-1176] Checkpoint window evaluated/expired state

    This uses the stateful bolt abstractions introduced in PR #939 to
    checkpoint the last expired and last evaluated tuple message ids.
    `IStatefulWindowedBolt` exposes the framework-managed state to the windowed
    bolts so that the state of the windowing operation can be saved. The last
    expired and last evaluated message ids are saved by
    `StatefulWindowedBoltExecutor` along with the state of the windowing
    operation. These ids are used during recovery to avoid duplicate window
    evaluations. This still provides at-least-once semantics but minimizes
    duplicate window evaluations.
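
    To illustrate the recovery step described above, here is a minimal sketch.
    It is not the code in `StatefulWindowedBoltExecutor`: the class
    `WindowRecoverySketch`, its `Checkpoint` type, and both method names are
    hypothetical, and it assumes message ids are monotonically increasing longs,
    which the real implementation may not require.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the Storm code base.
public class WindowRecoverySketch {

    /** The two progress markers saved with the checkpoint (assumed shape). */
    public static final class Checkpoint {
        final long lastExpired;
        final long lastEvaluated;

        public Checkpoint(long lastExpired, long lastEvaluated) {
            this.lastExpired = lastExpired;
            this.lastEvaluated = lastEvaluated;
        }
    }

    /**
     * Replayed tuples with ids at or below lastExpired had already left the
     * window before the failure, so they are not added back.
     */
    public static List<Long> dropExpired(List<Long> replayed, Checkpoint cp) {
        List<Long> live = new ArrayList<>();
        for (long id : replayed) {
            if (id > cp.lastExpired) {
                live.add(id);
            }
        }
        return live;
    }

    /**
     * Replayed tuples with ids above lastEvaluated have not been part of any
     * window evaluation yet; only these should trigger new evaluations.
     */
    public static List<Long> pendingEvaluation(List<Long> replayed, Checkpoint cp) {
        List<Long> pending = new ArrayList<>();
        for (long id : replayed) {
            if (id > cp.lastEvaluated) {
                pending.add(id);
            }
        }
        return pending;
    }
}
```

    Tuples between the two markers are restored into the window without being
    evaluated again. A failure between two checkpoints can still replay some
    evaluations, which is why the guarantee stays at-least-once.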
    
    Note: This PR includes the commits proposed in
    https://github.com/apache/storm/pull/939 since it builds on top of that.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/arunmahadevan/storm windowing-statefulwindow

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/963.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #963
    
----
commit a0fc48042fd8da20c9e4b748b3f73a1c4bbd4868
Author: Arun Mahadevan <[email protected]>
Date:   2015-11-30T11:06:10Z

    [STORM-1175] State store for windowing operations
    
    Added IStatefulBolt abstraction that can be implemented by bolts which
    require their state to be checkpointed periodically. State implementations
    based on a key-value mapping store are added. There is a default in-memory
    implementation and an optional Redis-based implementation that provides
    state persistence. An internal CheckpointSpout periodically emits checkpoint
    tuples which flow through the topology DAG to take a consistent snapshot of
    the state across all components.
    
    There is still pending work to capture the evaluated/expired state of the
    tuples in the Window and use it to prune duplicate window evaluations during
    restart. This can be built on top of the stateful bolt abstractions and will
    be done as part of STORM-1176.
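
    The idea of a checkpointed key-value state from this commit message can be
    sketched as follows. This is an illustration only: `InMemoryStateSketch` and
    its methods are hypothetical names, not the interfaces added by the commit,
    and the sketch assumes `commit()` is called when a checkpoint tuple arrives
    and `rollback()` when the bolt recovers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory key-value state; not the Storm implementation.
public class InMemoryStateSketch<K, V> {
    // Snapshot taken at the last checkpoint.
    private Map<K, V> committed = new HashMap<>();
    // Updates made since the last checkpoint.
    private Map<K, V> working = new HashMap<>();

    public void put(K key, V value) {
        working.put(key, value);
    }

    public V get(K key) {
        return working.get(key);
    }

    /** Assumed to run when a checkpoint tuple reaches the bolt. */
    public void commit() {
        committed = new HashMap<>(working);
    }

    /** Assumed to run on recovery: discard everything after the last checkpoint. */
    public void rollback() {
        working = new HashMap<>(committed);
    }
}
```

    A persistent variant would write the committed snapshot to an external store
    instead of keeping it in a second map.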

commit 8f929251a62e9bfc47b39f9c4f6640ab2d2a63b5
Author: Arun Mahadevan <[email protected]>
Date:   2015-12-10T20:09:12Z

    Refactored unit tests

commit 92e528a6beed74b8568878c30756d4bcb8e31405
Author: Arun Mahadevan <[email protected]>
Date:   2015-12-22T06:58:30Z

    Refactoring for accommodating windowing with state

commit c9af41870366e4c8b6f2dd0d86770c09859459bd
Author: Arun Mahadevan <[email protected]>
Date:   2015-12-22T07:12:53Z

    [STORM-1176] Checkpoint window evaluated/expired state
    
    This builds on top of the stateful bolt abstractions introduced in
    PR #939 and uses them to checkpoint the last expired and last evaluated
    tuple message ids. IStatefulWindowedBolt exposes the framework-managed state
    to the user's windowed bolt so that the state of the windowing
    operation can be saved. The last expired and last evaluated message ids
    are saved behind the scenes along with the state of the windowing operation,
    and these are used during recovery to avoid duplicate window evaluations.
    This still provides at-least-once semantics but reduces duplicate window
    evaluations.

----


