Hi, Flink should not be creating one output file per record. Could you post a snippet or a minimal code example that shows how you're setting up your sinks?
Best,
Aljoscha

> On 8. Aug 2017, at 19:08, Claire Yuan <clairey...@yahoo-inc.com> wrote:
>
> Hi all,
> I am currently running some jobs coded in Beam, in streaming mode, on a YARN session via Flink. My data sink was CSV files, like the one in the TfIdf example. I noticed that Beam's output format produces one file for every record, plus temp files for each, which causes my disk usage to exceed the maximum.
> I am not sure whether I am using the API incorrectly, but I am wondering if there is any way I can put all those records into one file, keep updating that file, or delete those temp files by windowing or triggering?
>
> Claire
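
---
For context, a common cause of many small output files in Beam streaming jobs is a windowed write without a shard limit. Below is a minimal sketch of how the sink setup might be constrained to one file per window. The method name `writeCsv`, the window size, and the output path are all hypothetical placeholders, not taken from Claire's job; it only assumes Beam's `TextIO` API with `withWindowedWrites()` and `withNumShards(...)`.

```java
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

public class CsvSinkSketch {
  // Hypothetical helper: 'lines' stands in for whatever PCollection<String>
  // the original job produces before writing its CSV output.
  static void writeCsv(PCollection<String> lines) {
    lines
        // Streaming (unbounded) writes need a windowing strategy;
        // 5-minute fixed windows are an assumption here.
        .apply(Window.<String>into(FixedWindows.of(Duration.standardMinutes(5))))
        .apply(TextIO.write()
            .to("/path/to/output/csv")  // placeholder output prefix
            .withWindowedWrites()       // required for unbounded input
            .withNumShards(1));         // one file per window pane, not one per bundle
  }
}
```

`withNumShards(1)` forces a single output file per window at the cost of write parallelism; in principle, the temp files are removed when a pane's write finalizes.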