GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/937
[FLINK-2324] [streaming] Partitioned state checkpointing rework + test
update
This commit reworks the way partitioned operator states are checkpointed
eliminating per-key checkpointing to increase performance and also updates the
StreamCheckpointingITCase to use partitioned state to check exactly-once
semantics.
It also eliminates a double checkpoint commit issue at head operators in
chains.
**Rework of partitioned state checkpointing**
Previously each OperatorState method would return a PartitionedStateHandle
object upon snapshotting which would contain a Map of Key -> StateHandle,
storing the partitioned states by key. This had two negative implications:
* At every checkpoint we stored the states for each key independently
(slowing the checkpoints and overloading the storage layer)
* We had to keep the keys in-memory (this might cause memory issues with
large keys)
This has been changed in this commit to store the map of states at each
operator in 1 statehandle, minimizing the time it takes to take a snapshot and
also minimizing the load on the storage layer. This might change slightly when
we start thinking about state repartitioning for operator scaling.
**Update StreamCheckpointingITCase with partitioned states**
In the StreamCheckpointingITCase I replaced the RichReduceFunction (which
was working incorrectly as it is not checkpointed) with a RichFlatMapFunction
implementing the stateful group reduce functionality using partitioned states,
to count the number of strings received for a specific prefix.
Using this we now test that exactly-once processing happens.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gyfora/flink partitioned-state-cleanup
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/937.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #937
----
commit fd0f77a5523b5819724a7152e27d23dad03d02f3
Author: Gyula Fora <[email protected]>
Date: 2015-07-26T16:02:56Z
[FLINK-2324] [streaming] Partitioned state checkpointing rework + test
update
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---