[
https://issues.apache.org/jira/browse/STORM-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067993#comment-15067993
]
ASF GitHub Bot commented on STORM-1176:
---------------------------------------
GitHub user arunmahadevan opened a pull request:
https://github.com/apache/storm/pull/963
[STORM-1176] Checkpoint window evaluated/expired state
This uses the stateful bolt abstractions introduced in PR# 939 to
checkpoint the last expired and last evaluated tuple message ids.
IStatefulWindowedBolt exposes the framework-managed state to the windowed bolts
so that the state of the windowing operation can be saved. The last expired and
last evaluated message ids are saved by `StatefulWindowedBoltExecutor` along
with the state of the windowing operation. These last expired/evaluated ids are
used during recovery to avoid duplicate window evaluations. This still provides
at-least-once semantics but minimizes the duplicate window evaluations.
Note: This PR includes the commits proposed in
https://github.com/apache/storm/pull/939 since it builds on top of that.
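The recovery idea described above can be illustrated with a minimal, self-contained sketch. It does not use the Storm API: `CheckpointState`, `RecoveringWindow`, the key names and the numeric message ids are all hypothetical stand-ins, and the sketch assumes that tuples are replayed in their original order and that the last expired tuple precedes the last evaluated one. The actual logic in `StatefulWindowedBoltExecutor` may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the Storm implementation.
class WindowCheckpointSketch {
    static final String LAST_EXPIRED = "lastExpired";     // assumed key name
    static final String LAST_EVALUATED = "lastEvaluated"; // assumed key name
    static final long NONE = -1L;                          // nothing checkpointed yet

    /** Stand-in for the framework-managed key-value state. */
    static final class CheckpointState {
        private final Map<String, Long> values = new HashMap<>();
        void put(String key, long value) { values.put(key, value); }
        long get(String key, long defaultValue) { return values.getOrDefault(key, defaultValue); }
    }

    /** Classifies replayed tuples after a restart using the checkpointed ids. */
    static final class RecoveringWindow {
        private final long lastExpired;
        private final long lastEvaluated;
        private boolean expiredSeen;
        private boolean evaluatedSeen;
        final List<Long> restored = new ArrayList<>(); // back in the window, not re-evaluated
        final List<Long> fresh = new ArrayList<>();    // processed and evaluated normally

        RecoveringWindow(CheckpointState state) {
            lastExpired = state.get(LAST_EXPIRED, NONE);
            lastEvaluated = state.get(LAST_EVALUATED, NONE);
            expiredSeen = lastExpired == NONE;
            evaluatedSeen = lastEvaluated == NONE;
        }

        void onReplay(long msgId) {
            if (!expiredSeen) {
                // Tuples up to and including the last expired one had already left the window.
                if (msgId == lastExpired) {
                    expiredSeen = true;
                    if (msgId == lastEvaluated) {
                        evaluatedSeen = true;
                    }
                }
                return;
            }
            if (!evaluatedSeen) {
                // Already evaluated before the restart: restore without triggering evaluation.
                restored.add(msgId);
                if (msgId == lastEvaluated) {
                    evaluatedSeen = true;
                }
                return;
            }
            fresh.add(msgId);
        }
    }

    public static void main(String[] args) {
        CheckpointState state = new CheckpointState();
        state.put(LAST_EXPIRED, 2L);
        state.put(LAST_EVALUATED, 4L);
        RecoveringWindow window = new RecoveringWindow(state);
        for (long id = 1; id <= 6; id++) {
            window.onReplay(id);
        }
        System.out.println("restored=" + window.restored + " fresh=" + window.fresh);
    }
}
```

With ids 1 to 6 replayed, tuples 1 and 2 are dropped, 3 and 4 are restored into the window without a new evaluation, and 5 and 6 are handled as new input. Tuples may still be delivered more than once, so the guarantee remains at-least-once; only the repeated window evaluations are pruned.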
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/arunmahadevan/storm windowing-statefulwindow
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/963.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #963
----
commit a0fc48042fd8da20c9e4b748b3f73a1c4bbd4868
Author: Arun Mahadevan <[email protected]>
Date: 2015-11-30T11:06:10Z
[STORM-1175] State store for windowing operations
Added IStatefulBolt abstraction that can be implemented by bolts that require
their state to be checkpointed periodically. State implementations based on a
key-value mapping store are added. There is a default in-memory implementation
and an optional Redis-based implementation that provides state persistence. An
internal CheckpointSpout periodically emits checkpoint tuples that flow through
the topology DAG to take a consistent snapshot of the state across all
components.
There is still pending work to capture the evaluated/expired state of the
tuples in the Window and use it to prune duplicate window evaluations during
restart. This can be built on top of the stateful bolt abstractions and will be
done as part of STORM-1176.
commit 8f929251a62e9bfc47b39f9c4f6640ab2d2a63b5
Author: Arun Mahadevan <[email protected]>
Date: 2015-12-10T20:09:12Z
Refactored unit tests
commit 92e528a6beed74b8568878c30756d4bcb8e31405
Author: Arun Mahadevan <[email protected]>
Date: 2015-12-22T06:58:30Z
Refactoring for accommodating windowing with state
commit c9af41870366e4c8b6f2dd0d86770c09859459bd
Author: Arun Mahadevan <[email protected]>
Date: 2015-12-22T07:12:53Z
[STORM-1176] Checkpoint window evaluated/expired state
This builds on top of the stateful bolt abstractions introduced in
PR# 939 and uses them to checkpoint the last expired and last evaluated
tuple message ids. IStatefulWindowedBolt exposes the framework-managed state
to the user's windowed bolt so that the state of the windowing
operation can be saved. The last expired and last evaluated message ids
are saved behind the scenes along with the state of the windowing operation,
and these ids are used during recovery to avoid duplicate window evaluations.
This still provides at-least-once semantics but reduces the duplicate window
evaluations.
----
> Checkpoint window evaluated state and use this to prune duplicate evaluations
> -----------------------------------------------------------------------------
>
> Key: STORM-1176
> URL: https://issues.apache.org/jira/browse/STORM-1176
> Project: Apache Storm
> Issue Type: Sub-task
> Reporter: Arun Mahadevan
> Assignee: Arun Mahadevan
>
> The evaluated state of sliding/tumbling windows should be checkpointed
> periodically, and on event replay during restart this information should be
> used to prune duplicate evaluations.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)