GitHub user gyfora opened a pull request:

    https://github.com/apache/flink/pull/937

    [FLINK-2324] [streaming] Partitioned state checkpointing rework + test 
update

    This commit reworks the way partitioned operator states are checkpointed 
eliminating per-key checkpointing to increase performance and also updates the 
StreamCheckpointingITCase to use partitioned state to check exactly-once 
semantics. 
    
    It also eliminates a double checkpoint commit issue at head operators in 
chains.
    
    **Rework of partitioned state checkpointing**
    Previously each OperatorState method would return a PartitionedStateHandle 
object upon snapshotting which would contain a Map of Key -> StateHandle, 
storing the partitioned states by key. This had two negative implications:
    * At every checkpoint we stored the states for each key independently 
(slowing the checkpoints and overloading the storage layer)
    * We had to keep the keys in-memory (this might cause memory issues with 
large keys)
    
    This has been changed in this commit to store the map of states at each 
operator in 1 statehandle, minimizing the time it takes to take a snapshot and 
also minimizing the load on the storage layer. This might change slightly when 
we start thinking about state repartitioning for operator scaling.
    
    **Update StreamCheckpointingITCase with partitioned states**
    In the StreamCheckpointingITCase I replaced the RichReduceFunction (which 
was working incorrectly as it is not checkpointed) with a RichFlatMapFunction 
implementing the stateful group reduce functionality using partitioned states, 
to count the number of strings received for a specific prefix.
    Using this we now test that exactly-once processing happens.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gyfora/flink partitioned-state-cleanup

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/937.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #937
    
----
commit fd0f77a5523b5819724a7152e27d23dad03d02f3
Author: Gyula Fora <[email protected]>
Date:   2015-07-26T16:02:56Z

    [FLINK-2324] [streaming] Partitioned state checkpointing rework + test 
update

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to