Re: CSV format and hdfs

2024-04-28 Thread gongzhongqiang
Hi Artem, I looked into this and opened an issue [1]; Rob Young, Alexander Fedulov and I discussed it there. We also think this performance issue can be solved by flushing manually. I have opened a PR [2]. You can cherry-pick it, build the package locally, and replace the jar in the lib folder. I'm willing to hear

Re: CSV format and hdfs

2024-04-25 Thread Robert Young
Hi Artem, I debugged Flink 1.17.1 (running CsvFilesystemBatchITCase) and I see the same behaviour. It's the same on master too. Jackson flushes [1] the underlying stream after every `writeValue` call. I experimented with disabling the flush by disabling Jackson's `FLUSH_PASSED_TO_STREAM` [2]
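
To illustrate why a flush after every `writeValue` call matters, here is a minimal stdlib-only sketch. It is not Flink's or Jackson's code: `FlushDemo`, `CountingStream` and `countFlushes` are hypothetical names, and a `BufferedOutputStream` stands in for the CSV writer. It only counts how many `flush()` calls reach the underlying stream when flushing per record versus flushing once manually at the end.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class FlushDemo {

    /** Hypothetical sink that counts how often flush() reaches it. */
    static final class CountingStream extends OutputStream {
        int flushes = 0;
        long bytes = 0;

        @Override public void write(int b) { bytes++; }
        @Override public void write(byte[] b, int off, int len) { bytes += len; }
        @Override public void flush() { flushes++; }
    }

    /**
     * Writes `records` small CSV rows through a buffer and returns the number
     * of flush() calls that reached the underlying sink.
     */
    public static int countFlushes(int records, boolean flushPerRecord) {
        CountingStream sink = new CountingStream();
        OutputStream out = new BufferedOutputStream(sink, 8192);
        byte[] row = "a,b,c\n".getBytes(StandardCharsets.UTF_8);
        try {
            for (int i = 0; i < records; i++) {
                out.write(row);
                if (flushPerRecord) {
                    out.flush(); // mimics a flush being passed down after every record
                }
            }
            out.flush(); // single manual flush once all records are written
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.flushes;
    }

    public static void main(String[] args) {
        System.out.println("flush per record: " + countFlushes(1000, true));
        System.out.println("manual flush:     " + countFlushes(1000, false));
    }
}
```

On a remote filesystem such as HDFS each flush that reaches the stream can be expensive, which is consistent with the slowdown discussed in this thread; the sketch only demonstrates the difference in call counts, not actual timings.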