[GitHub] [flink] carp84 commented on pull request #8751: [FLINK-11937][StateBackend]Resolve small file problem in RocksDB incremental checkpoint

GitBox Mon, 22 Mar 2021 08:37:36 -0700


carp84 commented on pull request #8751:
URL: https://github.com/apache/flink/pull/8751#issuecomment-804158779



   @StephanEwen please correct me if I'm wrong, but I think the issue FLIP-158 
is trying to resolve is orthogonal to this one (FLINK-11937).
   
   On one hand, IIUC, the snapshot interval in the FLIP-158 design (the 
interval at which SST files are generated, taking RocksDB as an example, so 
that the change-logs can be truncated) would be configurable. If it is set to 
a value similar to the old checkpoint interval, e.g. 10 min, we will see the 
same small file problem as observed now. Also, I don't think this snapshot 
interval should be too long, since it determines how much of the log has to be 
replayed during restore and thus affects recovery speed.
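   
   To make the trade-off concrete, here is a rough back-of-the-envelope sketch. 
The function names and the 5 MiB/s change-log write rate are made-up 
assumptions for illustration, not FLIP-158 APIs or measured numbers. It only 
shows that the worst-case log to replay grows linearly with the snapshot 
interval, while the number of snapshots (each producing new SST files) shrinks 
with it.

```python
# Toy model only: all names and rates below are hypothetical assumptions.

def worst_case_replay_bytes(changelog_rate_bytes_per_s: float,
                            snapshot_interval_s: float) -> float:
    """Upper bound on change-log bytes written since the last snapshot,
    i.e. the most that may need to be replayed on restore."""
    return changelog_rate_bytes_per_s * snapshot_interval_s


def snapshots_per_day(snapshot_interval_s: float) -> float:
    """How many snapshots (each uploading new SST files) happen per day."""
    return 86_400 / snapshot_interval_s


if __name__ == "__main__":
    MIB = 1024 * 1024
    assumed_rate = 5 * MIB  # assumed change-log write rate, bytes per second
    for minutes in (1, 10, 60):
        interval_s = minutes * 60
        replay_mib = worst_case_replay_bytes(assumed_rate, interval_s) / MIB
        print(f"interval={minutes} min: "
              f"replay up to {replay_mib:.0f} MiB, "
              f"{snapshots_per_day(interval_s):.0f} snapshots/day")
```

   A shorter interval keeps the replay small but produces SST uploads more 
often, which is where the small file problem would show up again.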
   
   OTOH, I'm not sure about the value of using change-log based checkpointing 
with a long checkpoint interval (say, more than 10 min). Saving change logs 
consumes additional network bandwidth and disk space (since the SST upload 
process is still kept for log truncation) and increases the latency of regular 
record processing (because of the "double-writing"). That is a good trade-off 
for very short checkpoint intervals, but not that effective for long ones, 
IMHO.
   
   I'm not sure whether I'm missing anything, so please let me know your 
thoughts. Thanks.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
