Re: BucketingSink vs StreamingFileSink

2018-11-21 Thread Edward Alexander Rojas Clavijo
Thank you very much for the information, Andrey. I'll try migrating what we have now and adding the sink with Parquet, and I'll get back to you if I have more questions :)

Edward

On Fri, 16 Nov 2018 at 19:54, Andrey Zagrebin (<and...@data-artisans.com>) wrote:

Re: BucketingSink vs StreamingFileSink

2018-11-16 Thread Andrey Zagrebin
Hi, StreamingFileSink is supposed to subsume BucketingSink, which will be deprecated. StreamingFileSink fixes some issues of BucketingSink, especially with AWS S3, and adds more flexibility in defining the rolling policy. StreamingFileSink does not support older Hadoop versions at the moment,
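
To make the "rolling policy" idea mentioned above a bit more concrete, here is a minimal conceptual sketch in plain Java. It is not Flink's API: the class `RollingPolicySketch` and its method and parameter names are hypothetical and chosen only for illustration. It assumes a part file is closed ("rolled") when it grows past a size limit, has been open too long, or has not been written to for a while, which are the kinds of criteria a rolling policy typically lets you tune.

```java
import java.time.Duration;

/**
 * Conceptual model of a file rolling policy (hypothetical names, NOT Flink's API).
 * A part file is rolled when any of three thresholds is reached.
 */
final class RollingPolicySketch {
    private final long maxPartSizeBytes;
    private final long rolloverIntervalMillis;
    private final long inactivityIntervalMillis;

    RollingPolicySketch(long maxPartSizeBytes,
                        Duration rolloverInterval,
                        Duration inactivityInterval) {
        this.maxPartSizeBytes = maxPartSizeBytes;
        this.rolloverIntervalMillis = rolloverInterval.toMillis();
        this.inactivityIntervalMillis = inactivityInterval.toMillis();
    }

    /** Checked when a record is written: roll once the part file reaches the size limit. */
    boolean shouldRollOnWrite(long currentPartSizeBytes) {
        return currentPartSizeBytes >= maxPartSizeBytes;
    }

    /**
     * Checked periodically: roll if the part file has been open for at least the
     * rollover interval, or has received no writes for at least the inactivity interval.
     */
    boolean shouldRollOnTimer(long nowMillis, long createdMillis, long lastWriteMillis) {
        boolean openTooLong = nowMillis - createdMillis >= rolloverIntervalMillis;
        boolean idleTooLong = nowMillis - lastWriteMillis >= inactivityIntervalMillis;
        return openTooLong || idleTooLong;
    }
}
```

In this sketch the size check runs on every write while the time checks run on a timer, so an idle bucket still gets its file closed even when no new records arrive. The actual thresholds and how they are configured depend on the Flink version in use.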

BucketingSink vs StreamingFileSink

2018-11-16 Thread Edward Rojas
Hello, We are currently using Flink 1.5 and we use the BucketingSink to save the result of job processing to HDFS. The data is in JSON format and we store one object per line in the resulting files. We are planning to upgrade to Flink 1.6 and we see that there is this new StreamingFileSink,