Hi, Flink should not be creating one output file per record. Could you post a snippet or a minimal code example that shows how you're setting up your sinks?
Best,
Aljoscha

> On 8. Aug 2017, at 19:08, Claire Yuan <clairey...@yahoo-inc.com> wrote:
>
> Hi all,
> I am currently running some jobs coded in Beam, in streaming mode, on a YARN session via Flink. My data sink was CSV files, like the one in the TfIdf example. I noticed that Beam's output format produces one file for every record, plus temp files for each, which causes my disk usage to exceed the maximum.
> I am not sure whether I am using the API incorrectly, but I am wondering if there is any way I can put all those records into one file, keep updating that file, or delete those temp files by windowing or triggering?
>
> Claire
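
---
For context, a common cause of many small output files in Beam streaming jobs is a windowed write without a shard limit. Below is a minimal sketch of how the sink setup might be constrained to one file per window. The method name `writeCsv`, the window size, and the output path are all hypothetical placeholders, not taken from Claire's job; it only assumes Beam's `TextIO` API with `withWindowedWrites()` and `withNumShards(...)`.

```java
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

public class CsvSinkSketch {
  // Hypothetical helper: 'lines' stands in for whatever PCollection<String>
  // the original job produces before writing its CSV output.
  static void writeCsv(PCollection<String> lines) {
    lines
        // Streaming (unbounded) writes need a windowing strategy;
        // 5-minute fixed windows are an assumption here.
        .apply(Window.<String>into(FixedWindows.of(Duration.standardMinutes(5))))
        .apply(TextIO.write()
            .to("/path/to/output/csv")  // placeholder output prefix
            .withWindowedWrites()       // required for unbounded input
            .withNumShards(1));         // one file per window pane, not one per bundle
  }
}
```

`withNumShards(1)` forces a single output file per window at the cost of write parallelism; in principle, the temp files are removed when a pane's write finalizes.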