Hi, IIUC, numRetainedCheckpoints only influences the space overhead of the checkpoint directory, not the incremental size. RocksDB takes incremental checkpoints based on the shared directory, which retains as many SST files as possible (some may come from the last checkpoint, others from long ago). A larger numRetainedCheckpoints just makes Flink keep more cp-x directories, plus SST files in the shared directory that are not used by the next incremental checkpoint. Whether it's 1 or 3, the size of the incremental checkpoint should be similar.
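To make the reasoning concrete, here is a toy Python model. It is not Flink or RocksDB code, and every name in it (simulate, live, retained, shared) is made up for illustration. It assumes each incremental checkpoint uploads only the SST files that were not part of the most recent completed checkpoint, and that a file stays in the shared directory as long as any retained checkpoint references it. File sizes, flush counts and compactions are random placeholders, not measurements.

```python
import random


def simulate(num_retained, num_checkpoints=20, seed=7):
    """Toy model of incremental checkpoints (assumes num_retained >= 1).

    Returns (bytes uploaded per checkpoint, final size of the shared dir).
    """
    rng = random.Random(seed)
    live = {}        # SST files currently in the local RocksDB: name -> size
    retained = []    # one set of file names per retained checkpoint
    shared = {}      # files kept in the shared directory: name -> size
    uploaded = []    # incremental size of each checkpoint
    next_id = 0

    for _ in range(num_checkpoints):
        # Between checkpoints, memtable flushes create a few new SST files.
        for _ in range(rng.randint(1, 3)):
            live[f"{next_id:06d}.sst"] = rng.randint(1, 64)
            next_id += 1

        # Sometimes a compaction merges the two oldest files into a new one.
        if len(live) > 4 and rng.random() < 0.5:
            victims = sorted(live)[:2]
            merged_size = sum(live.pop(name) for name in victims)
            live[f"{next_id:06d}.sst"] = merged_size
            next_id += 1

        # Incremental part: files not present in the previous checkpoint.
        previous = retained[-1] if retained else set()
        new_files = {n: s for n, s in live.items() if n not in previous}
        shared.update(new_files)
        uploaded.append(sum(new_files.values()))

        # Register this checkpoint and drop the ones beyond the limit.
        retained.append(set(live))
        retained = retained[-num_retained:]

        # Shared files that no retained checkpoint references are deleted.
        referenced = set().union(*retained)
        shared = {n: s for n, s in shared.items() if n in referenced}

    return uploaded, sum(shared.values())


if __name__ == "__main__":
    inc_1, dir_1 = simulate(num_retained=1)
    inc_3, dir_3 = simulate(num_retained=3)
    print("same incremental sizes:", inc_1 == inc_3)
    print("shared dir size with 1 vs 3 retained:", dir_1, dir_3)
```

In this model the per-checkpoint upload never looks at num_retained, so the two runs upload the same amounts; only the total kept in the shared directory can grow when more checkpoints are retained.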
Could you check your configuration, source status, job status, etc. again to see whether there are any other differences?

On Mon, Dec 19, 2022 at 9:00 PM Puneet Duggal <puneetduggal1...@gmail.com> wrote:

> Hi,
>
> After going through the following article regarding rocksdb incremental
> checkpoint
> (https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html),
> my understanding was that at each checkpoint, flink only checkpoints newly
> created SSTables whereas others it can reference from earlier checkpoints
> (depending upon num of retained checkpoints).
>
> So can we assume from this that if numRetainedCheckpoints = 1 (default),
> behaviour is similar to checkpointing complete data as it is (same as non
> incremental checkpointing).
>
> Also performed a load test by running exactly same flink job on 2
> different clusters. Only difference between all these clusters were
> numOfRetained checkpoints.
>
> Incremental Checkpoint Load Test
>
> Cluster 1
>
> num Retained Checkpoints = 3
>
>
> Cluster 2
>
> num Retained Checkpoints = 1
>
>
>
> As we can see, checkpoint data size for cluster with num of retained
> checkpoints = 1 is less than one with greater number of retained
> checkpoints.
>
>
>

--
Best,
Hangxiang.