Large number of sources in Flink Job

2018-05-27 Thread Chirag Dewan
Hi, I am working on a use case where my Flink job needs to collect data from thousands of sources.  As an example, I want to collect data from more than 2000 File Directories, process(filter, transform) the data and distribute the processed data streams to 200 different directories. Are there

Clarification in TumblingProcessing TimeWindow Documentation

2018-05-27 Thread Dhruv Kumar
Hi I was looking at TumblingProcessingTimeWindows.java and was a bit confused with the documentation at the start of this class.

Writing Table API results to a csv file

2018-05-27 Thread chrisr123
I'm using Flink 1.4.0 I'm trying to save the results of a Table API query to a CSV file, but I'm getting an error. Here are the details: My Input file looks like this: id,species,color,weight,name 311,canine,golden,75,dog1 312,canine,brown,22,dog2 313,feline,gray,8,cat1 I run a query on this to

Re: Checkpointing when reading from files?

2018-05-27 Thread Padarn Wilson
I'm a bit confused about this too actually. I think the above would work as a solution if you want to continuously monitor a directory, but for a "PROCESS_ONCE" readFile source I don't think you will get a checkpoint emitted indicating the end of the stream. My understanding of this is that there

Re: sharebuffer prune code

2018-05-27 Thread Dawid Wysakowicz
The logic for SharedBuffer and in result for prunning will be changed in FLINK-9418 [1]. We plan to make it backwards compatible. There is already open PR[2] (in review), you can check if the problem persists. Regards, Dawid [1] https://issues.apache.org/jira/browse/FLINK-9418 [2]