[ 
https://issues.apache.org/jira/browse/FLINK-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993335#comment-15993335
 ] 

ASF GitHub Bot commented on FLINK-6364:
---------------------------------------

Github user StefanRRichter commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3801#discussion_r114325070
  
    --- Diff: 
flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java
 ---
    @@ -265,9 +281,64 @@ private boolean hasRegisteredState() {
                        final CheckpointStreamFactory streamFactory,
                        CheckpointOptions checkpointOptions) throws Exception {
     
    +           if (checkpointOptions.getCheckpointType() == 
CheckpointOptions.CheckpointType.INCREMENTAL_CHECKPOINT) {
    +                   return snapshotIncrementally(checkpointId, timestamp, 
streamFactory);
    +           } else {
    +                   return snapshotFully(checkpointId, timestamp, 
streamFactory);
    +           }
    +   }
    +
    +   private RunnableFuture<KeyedStateHandle> snapshotIncrementally(
    +                   final long checkpointId,
    --- End diff --
    
    This method has some amount of duplicated code with `snapshotFully`. My 
suggestion would be target for some common abstract strategy common to all 
checkpointing, e.g. `takeSnapshot`, `materialize`, and `releaseResources`. Then 
we could have a factory that, depending on the checkpoint type, instantiates 
the right implementation. This would also help with a second concern: I can see 
that the code in this class has grown a lot over time, so this could be the 
time to move some aspects like checkpointing strategies into separated classes 
(they are already static inner classes anyways). By splitting a bit between 
processing logic and snapshot/restore logic, we can keep things more modular 
and separation of concerns.
    
    We can also do this cleanup in a followup PR.


> Implement incremental checkpointing in RocksDBStateBackend
> ----------------------------------------------------------
>
>                 Key: FLINK-6364
>                 URL: https://issues.apache.org/jira/browse/FLINK-6364
>             Project: Flink
>          Issue Type: Sub-task
>          Components: State Backends, Checkpointing
>            Reporter: Xiaogang Shi
>            Assignee: Xiaogang Shi
>
> {{RocksDBStateBackend}} is well suited for incremental checkpointing because 
> RocksDB is base on LSM trees,  which record updates in new sst files and all 
> sst files are immutable. By only materializing those new sst files, we can 
> significantly improve the performance of checkpointing.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to