Hi Kostas,

Thank you for replying. I am already using the ability to set part suffix.
I was not aware of this issue - https://issues.apache.org/
jira/browse/FLINK-9603. Thanks for pointing out, I'll make sure to use the
1.5.3 version of the sink.

Thanks
Lakshmi


On Thu, Jul 12, 2018 at 4:55 AM, vino yang <yanghua1...@gmail.com> wrote:

> Hi Kostas, good job!
>
> 2018-07-12 19:40 GMT+08:00 Kostas Kloudas <k.klou...@data-artisans.com>:
>
> > Hi Lakshmi,
> >
> > Since Flink-1.5 you have the ability to set the part suffix.
> > As you said, you only want the .gzip to be the suffix of the final (or
> > “completed”) part files, which is exactly what is currently supported.
> >
> > If you want also intermediate files to have this suffix, then you can
> > always set all the suffixes (in-progress, pending and final) to “.gzip”
> > but then you have to also set the appropriate preffixes so that Flink can
> > distinguish completed from non-completed files (filenames
> > must not collide).
> >
> > Also I would recommend to use the most recent stable version 1.5.3 which
> > also includes this bug fix:
> > https://issues.apache.org/jira/browse/FLINK-9603 <
> > https://issues.apache.org/jira/browse/FLINK-9603>
> >
> > I hope this helps,
> > Kostas
> >
> >
> > > On Apr 5, 2018, at 6:23 PM, Lakshmi Gururaja Rao <l...@lyft.com>
> wrote:
> > >
> > > I can see two ways of achieving this:
> > >
> > > 1. Setting a suffix* **only*** for the completed part files. I don't
> > > necessarily think the suffix should be added for the intermediate files
> > (as
> > > intermediate files should not really be ready for consumption by a
> > > downstream process?)
> > > 2. Be able to override this partPath name creation -
> > > https://github.com/apache/flink/blob/release-1.4.0/
> > flink-connectors/flink-connector-filesystem/src/main/
> > java/org/apache/flink/streaming/connectors/fs/
> > bucketing/BucketingSink.java#L523
> > > . That way any user who needs to set a custom/dynamic part file name
> can
> > > still do so.
> > >
> > > Do you think either or one of these options is feasible?
> > >
> > > Thanks
> > > Lakshmi
> > >
> > > On Tue, Apr 3, 2018 at 12:57 AM, Aljoscha Krettek <aljos...@apache.org
> >
> > > wrote:
> > >
> > >> So you want to be able to set a "global" suffix that should be
> appended
> > to
> > >> all different kinds of files that the sink writes, including
> > intermediate
> > >> files?
> > >>
> > >> Aljoscha
> > >>
> > >>> On 29. Mar 2018, at 16:59, l...@lyft.com wrote:
> > >>>
> > >>> Sorry, I meant "I don't see a way of doing this apart from setting a
> > >> part file *suffix* with the required file extension. "
> > >>>
> > >>>
> > >>> On 2018/03/29 14:55:43, l...@lyft.com <l...@lyft.com> wrote:
> > >>>> Currently the BucketingSink allows addition of part prefix, pending
> > >> prefix/suffix and in-progress prefix/suffix via setter methods. Can we
> > also
> > >> support setting part suffixes?
> > >>>> An instance where this maybe useful: I am currently writing GZIP
> > >> compressed output to S3 using the BucketingSink and I would want the
> > >> uploaded files to have a ".gz" or ".zip" extensions (if the files does
> > not
> > >> have such an extensionelse they are written as garbled bytes and don't
> > get
> > >> rendered correctly for reading). I don't see a way of doing this apart
> > from
> > >> setting a part file prefix with the required file extension.
> > >>>>
> > >>>> Thanks
> > >>>> Lakshmi
> > >>>>
> > >>
> > >>
> > >
> > >
> > > --
> > > *Lakshmi Gururaja Rao*
> > > SWE
> > > 217.778.7218 <+12177787218>
> > > [image: Lyft] <http://www.lyft.com/>
> >
> >
>



-- 
*Lakshmi Gururaja Rao*
SWE
217.778.7218 <+12177787218>
[image: Lyft] <http://www.lyft.com/>

Reply via email to