Re: SplittableDoFn with Flink fails at checkpointing larger files (200MB)

2020-02-25 Thread Maximilian Michels
Hi Marek,

That's a great question. The answer depends on whether you are using portability or the "classic" Runner:

Portability
===

In portability, the SDF functionality includes the option for the Runner to split a given bundle such that the remaining current bundle's work will be

SplittableDoFn with Flink fails at checkpointing larger files (200MB)

2020-02-07 Thread marek-simunek
Hi, I am using FileIO to continuously watch a folder for new files to process. The problem is that when Flink starts reading a 200MB file (around 3M elements) and also starts checkpointing, the checkpoint never finishes until the WHOLE file is processed. Minimal example: