carp84 commented on pull request #8751: URL: https://github.com/apache/flink/pull/8751#issuecomment-804291991
Thanks for the clarification @StephanEwen. I think size-based and time-based snapshots are two options, each with its own advantages. A size-based snapshot for log truncation/consolidation would indeed prevent the small file problem, but may also lead to a longer recovery time (in the worst case we replay up to 0.95 x write_buffer_size of logs during restore, i.e. everything accumulated just before the size-based flush would have triggered). A time-based snapshot could compact/truncate the log at a fixed pace, but may introduce the small file problem unless we add some mechanism similar to the proposal here, which I think is still valuable. And just to confirm: are we aiming to completely replace the snapshot-based checkpoint with the log-based checkpoint in the future, or will both be kept for different user scenarios? If the latter, I think we still need to resolve the small file problem for snapshot-based checkpoints.
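
To make the trade-off concrete, here is a rough back-of-the-envelope sketch. It is not taken from the PR or from Flink code: the function names, the 64 MiB `write_buffer_size`, the 100 KiB/s ingest rate and the 60 s interval are all assumed values for illustration, and only the 0.95 factor comes from the comment above.

```python
# Illustrative arithmetic only; every name and number here is hypothetical
# except the 0.95 flush ratio mentioned in the comment.

def size_based_worst_case_replay(write_buffer_size: int, ratio_percent: int = 95) -> int:
    """Upper bound on log bytes replayed at restore when the flush fires
    at ratio_percent of write_buffer_size (integer math avoids float rounding)."""
    return write_buffer_size * ratio_percent // 100


def time_based_worst_case_replay(ingest_bytes_per_sec: int, interval_sec: int) -> int:
    """Upper bound on log bytes replayed at restore when the snapshot fires
    every interval_sec seconds at a steady ingest rate."""
    return ingest_bytes_per_sec * interval_sec


def files_per_hour_time_based(interval_sec: int) -> int:
    """Number of snapshot files produced per hour by a fixed-interval trigger."""
    return 3600 // interval_sec


def files_per_hour_size_based(ingest_bytes_per_sec: int, write_buffer_size: int) -> int:
    """Number of full snapshot files produced per hour by a size trigger
    at a steady ingest rate (rounded down)."""
    threshold = size_based_worst_case_replay(write_buffer_size)
    return ingest_bytes_per_sec * 3600 // threshold


if __name__ == "__main__":
    buffer_size = 64 * 1024 * 1024   # assumed write_buffer_size: 64 MiB
    ingest = 100 * 1024              # assumed ingest rate: 100 KiB/s
    interval = 60                    # assumed time-based interval: 60 s

    print("size-based replay bound:", size_based_worst_case_replay(buffer_size))
    print("time-based replay bound:", time_based_worst_case_replay(ingest, interval))
    print("size-based files/hour:", files_per_hour_size_based(ingest, buffer_size))
    print("time-based files/hour:", files_per_hour_time_based(interval))
```

Under these assumed numbers the size-based trigger produces few, large files but a replay bound of roughly 61 MiB, while the time-based trigger bounds replay to about 6 MiB at the cost of many more small files per hour.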
