Hi all,

I am currently running some jobs written in Beam, in streaming mode, on a Flink YARN session. My data sink is CSV files, like the one in the TfIdf example. I noticed that Beam's output format produces one file per record, plus temp files for each of them, which causes my disk usage to exceed its maximum.

I am not sure whether the problem is that I am using the API incorrectly, but I am wondering: is there any way to put all those records into one file, or keep appending to that file, or delete the temp files via windowing or triggering?
