Re: CSV format and hdfs

2024-04-25 Thread Robert Young
Hi Artem, I debugged Flink 1.17.1 (running CsvFilesystemBatchITCase) and I see the same behaviour; it's the same on master too. Jackson flushes [1] the underlying stream after every `writeValue` call. I experimented with suppressing that flush by disabling Jackson's FLUSH_PASSED_TO_STREAM [2]
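To make the per-record flush concrete, here is a minimal stand-alone sketch. It deliberately uses neither Jackson nor Flink: `FlushCountDemo`, `CountingStream` and `countFlushes` are hypothetical names made up for illustration, and the loop only imitates a writer that flushes after each record. It counts how many `flush()` calls reach the underlying stream in each mode, which is the figure that matters when each of those calls is costly on the target file system.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class FlushCountDemo {

    /** Hypothetical wrapper: counts the flush() calls that reach the wrapped stream. */
    static final class CountingStream extends FilterOutputStream {
        int flushes = 0;

        CountingStream(OutputStream out) {
            super(out);
        }

        @Override
        public void flush() throws IOException {
            flushes++;
            super.flush();
        }
    }

    /**
     * Writes `records` CSV-like lines and returns the number of flushes seen
     * by the underlying stream. With flushEachRecord = true the loop imitates
     * a writer that flushes after every record (an assumption standing in for
     * the behaviour described above, not Jackson's actual code path).
     */
    static int countFlushes(int records, boolean flushEachRecord) throws IOException {
        CountingStream stream = new CountingStream(new ByteArrayOutputStream());
        for (int i = 0; i < records; i++) {
            stream.write(("id-" + i + ",value\n").getBytes(StandardCharsets.UTF_8));
            if (flushEachRecord) {
                stream.flush();
            }
        }
        stream.flush(); // one final flush when the writer finishes
        return stream.flushes;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("flush per record:  " + countFlushes(1000, true));
        System.out.println("flush at end only: " + countFlushes(1000, false));
    }
}
```

In this sketch the flush count grows with the number of records in the first mode and stays at one in the second, so the difference only matters when a flush on the real stream is expensive.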

Re: Understanding checkpoint/savepoint storage requirements

2024-04-02 Thread Robert Young
eifan Wang wrote: > Hi Robert, your understanding is right! Some more information: the JobManager is not only responsible for cleaning up old checkpoints, but also needs to write the metadata file to checkpoint storage after all taskmanagers have tak

Understanding checkpoint/savepoint storage requirements

2024-03-27 Thread Robert Young
Hi all, I have some questions about checkpoint and savepoint storage. From what I understand, a distributed, production-quality job with a lot of state should use durable shared storage for checkpoints and savepoints. All job managers and task managers should access the same volume. So typically