Re: CSV StreamingFileSink

2020-02-19 Thread Austin Cawley-Edwards
Hey Timo,

Thanks for the bucket-assignment link! Looks like most of my issues can be solved
by getting better acquainted with Java file APIs and not in Flink-land.


Best,
Austin
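
A rough sketch of that "plain Java file APIs, not Flink-land" direction, for the
record: gzip the CSV bytes with java.util.zip inside a BulkWriter handed to the
StreamingFileSink's bulk format. MyRecord and its getters are hypothetical stand-ins
for whatever the actual record type is, and the line encoding here is deliberately naive.

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

import org.apache.flink.api.common.serialization.BulkWriter;
import org.apache.flink.core.fs.FSDataOutputStream;

// Writes each record as one gzipped CSV line; the compression is plain java.util.zip.
public class GzipCsvWriterFactory implements BulkWriter.Factory<MyRecord> {

    @Override
    public BulkWriter<MyRecord> create(FSDataOutputStream out) throws IOException {
        final GZIPOutputStream gzip = new GZIPOutputStream(out);
        return new BulkWriter<MyRecord>() {
            @Override
            public void addElement(MyRecord record) throws IOException {
                // Naive CSV encoding; a real job would use a CSV library for quoting/escaping.
                String line = record.getId() + "," + record.getValue() + "\n";
                gzip.write(line.getBytes(StandardCharsets.UTF_8));
            }

            @Override
            public void flush() throws IOException {
                gzip.flush();
            }

            @Override
            public void finish() throws IOException {
                // Writes the gzip trailer but must not close the underlying part-file stream.
                gzip.finish();
            }
        };
    }
}

Wired into the sink with something like
StreamingFileSink.forBulkFormat(new Path("/path/to/output"), new GzipCsvWriterFactory()).build();
keeping in mind that bulk formats roll a new part file on every checkpoint.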

On Wed, Feb 19, 2020 at 6:48 AM Timo Walther wrote:

> Hi Austin,
>
> the StreamingFileSink allows bucketing the output data.
>
> This should help for your use case:
>
>
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#bucket-assignment
>
> Regards,
> Timo
>
>
> On 19.02.20 01:00, Austin Cawley-Edwards wrote:
> > Following up on this -- does anyone know if it's possible to stream
> > individual files to a directory using the StreamingFileSink? For
> > instance, if I want all records that come in during a certain day to be
> > partitioned into daily directories:
> >
> > 2020-02-18/
> > large-file-1.txt
> > large-file-2.txt
> > 2020-02-19/
> > large-file-3.txt
> >
> > Or is there another way to accomplish this?
> >
> > Thanks!
> > Austin
> >
> > On Tue, Feb 18, 2020 at 5:33 PM Austin Cawley-Edwards
> > <austin.caw...@gmail.com> wrote:
> >
> > Hey all,
> >
> > Has anyone had success using the StreamingFileSink[1] to write CSV
> > files? And if so, what about compressed (ideally Gzipped) files, and
> > which libraries did you use?
> >
> >
> > Best,
> > Austin
> >
> >
> > [1]:
> >
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html
> >
>
>


Re: CSV StreamingFileSink

2020-02-19 Thread Timo Walther

Hi Austin,

the StreamingFileSink allows bucketing the output data.

This should help for your use case:

https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#bucket-assignment

Regards,
Timo
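
A minimal sketch of daily bucketing along those lines, assuming a Flink 1.9/1.10-era
row-format sink, a hypothetical /path/to/output base path, and an existing
DataStream<String> called stream; note that DateTimeBucketAssigner buckets by
processing time by default.

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.DateTimeBucketAssigner;

// One directory per day, e.g. /path/to/output/2020-02-18/part-0-0
StreamingFileSink<String> sink = StreamingFileSink
        .forRowFormat(new Path("/path/to/output"), new SimpleStringEncoder<String>("UTF-8"))
        .withBucketAssigner(new DateTimeBucketAssigner<>("yyyy-MM-dd"))
        .build();

// "stream" is assumed to be a DataStream<String> of already-encoded CSV lines.
stream.addSink(sink);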


On 19.02.20 01:00, Austin Cawley-Edwards wrote:
Following up on this -- does anyone know if it's possible to stream 
individual files to a directory using the StreamingFileSink? For 
instance, if I want all records that come in during a certain day to be 
partitioned into daily directories:


2020-02-18/
    large-file-1.txt
    large-file-2.txt
2020-02-19/
    large-file-3.txt

Or is there another way to accomplish this?

Thanks!
Austin

On Tue, Feb 18, 2020 at 5:33 PM Austin Cawley-Edwards 
<austin.caw...@gmail.com> wrote:


Hey all,

Has anyone had success using the StreamingFileSink[1] to write CSV
files? And if so, what about compressed (ideally Gzipped) files, and
which libraries did you use?


Best,
Austin


[1]:

https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html





Re: CSV StreamingFileSink

2020-02-18 Thread Austin Cawley-Edwards
Following up on this -- does anyone know if it's possible to stream
individual files to a directory using the StreamingFileSink? For instance,
if I want all records that come in during a certain day to be
partitioned into daily directories:

2020-02-18/
   large-file-1.txt
   large-file-2.txt
2020-02-19/
   large-file-3.txt

Or is there another way to accomplish this?

Thanks!
Austin

On Tue, Feb 18, 2020 at 5:33 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

> Hey all,
>
> Has anyone had success using the StreamingFileSink[1] to write CSV files?
> And if so, what about compressed (ideally Gzipped) files, and which libraries
> did you use?
>
>
> Best,
> Austin
>
>
> [1]:
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html
>


CSV StreamingFileSink

2020-02-18 Thread Austin Cawley-Edwards
Hey all,

Has anyone had success using the StreamingFileSink[1] to write CSV files?
And if so, what about compressed (ideally Gzipped) files, and which libraries
did you use?


Best,
Austin


[1]:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html