Re: Scheduling sources

2018-09-26 Thread Averell
Hi Tison, "/setting a timer on schedule start and trigger (1) after a fixed delay/" would be quite sufficient for me. Looking forward to the change of that Jira ticket's status. Thanks for your help. Regards, Averell -- Sent from:

Re: Scheduling sources

2018-09-26 Thread Tzu-Li Chen
Hi Averell, As Till pointed out, Flink currently doesn't provide such a flexible schedule strategy. However, our team has internally implemented a mechanism that allows user-defined schedule plugins (flexible schedule strategies). It could fit your case by setting a timer on schedule start and triggering (1) after a fixed delay.

Re: Scheduling sources

2018-09-26 Thread Averell
Hi Kostas, So that means my 2a will be broadcast to all TMs? Is it possible to partition that instead, as I'm using a CoProcessFunction to join 1 with 2? Thanks and best regards, Averell

Re: Scheduling sources

2018-09-26 Thread Kostas Kloudas
Hi Averell, If 2a fits in memory, then you can load the data into all TMs in the open() method of any rich function, e.g. a ProcessFunction [1]. open() runs before any data is allowed to flow into your pipeline from the sources. Cheers, Kostas [1]
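[Editor's note] A minimal sketch of this suggestion, assuming both streams can be modelled as Tuple2<key, payload> and that the initial snapshot (2a) comes from a plain JDBC query; the loadInitialEnrichmentFromRdbms helper below is hypothetical and left empty. The snapshot is read in open(), before any element is processed, and the incremental updates (2b) keep it current:

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
import org.apache.flink.util.Collector;

// Joins the raw stream (1) with enrichment data: the initial snapshot (2a)
// is loaded in open(), before any element flows in, and the incremental
// Kafka stream (2b) keeps it up to date via processElement2().
public class EnrichmentJoin
        extends CoProcessFunction<Tuple2<String, String>, Tuple2<String, String>, String> {

    // One copy per parallel task; assumes the 2a snapshot fits in memory.
    private transient Map<String, String> enrichment;

    @Override
    public void open(Configuration parameters) throws Exception {
        enrichment = new HashMap<>();
        // Hypothetical helper standing in for a plain JDBC query of the table.
        enrichment.putAll(loadInitialEnrichmentFromRdbms());
    }

    @Override
    public void processElement1(Tuple2<String, String> raw, Context ctx, Collector<String> out) throws Exception {
        // Enrich each raw record with whatever is currently known for its key.
        out.collect(raw.f1 + " | " + enrichment.getOrDefault(raw.f0, "n/a"));
    }

    @Override
    public void processElement2(Tuple2<String, String> update, Context ctx, Collector<String> out) throws Exception {
        // Incremental updates (2b) overwrite the preloaded values.
        enrichment.put(update.f0, update.f1);
    }

    private Map<String, String> loadInitialEnrichmentFromRdbms() {
        return new HashMap<>(); // implementation omitted
    }
}

Since the map is held per parallel task, every TM slot loads the full table; this is the "load to all TMs" approach rather than a partitioned one.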

Re: Scheduling sources

2018-09-25 Thread Averell
Thank you Till. My use case is like this: I have two streams, one is raw data (1), the other is enrichment data (2), which in turn consists of two components: initial enrichment data (2a) which comes from an RDBMS table, and incremental data (2b) which comes from a Kafka stream. To ensure that the enrichment data is in place before the raw data is processed, I want the raw-data source to start only some time after the enrichment source has started.
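[Editor's note] For illustration, a rough wiring of the pipeline described above, reusing the EnrichmentJoin sketched after Kostas's reply earlier in this thread. Paths are placeholders, and the Kafka source for (2b) is modelled as a second file to keep the sketch short; a real job would use a Kafka consumer there:

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class EnrichmentPipeline {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // (1) raw data; the path is a placeholder.
        DataStream<Tuple2<String, String>> raw = toKeyed(env.readTextFile("/data/raw"));

        // (2a) initial enrichment snapshot exported from the RDBMS, and
        // (2b) incremental updates -- a Kafka source in the real job,
        //      modelled here as a second file for brevity.
        DataStream<Tuple2<String, String>> initial = toKeyed(env.readTextFile("/data/enrichment-2a"));
        DataStream<Tuple2<String, String>> updates = toKeyed(env.readTextFile("/data/enrichment-2b"));

        // (2) = (2a) followed by (2b), joined against (1) with the
        // CoProcessFunction sketched above.
        raw.connect(initial.union(updates))
           .process(new EnrichmentJoin())
           .print();

        env.execute("enrichment join");
    }

    // Treat the first comma-separated field as the join key.
    private static DataStream<Tuple2<String, String>> toKeyed(DataStream<String> lines) {
        return lines
            .map(line -> Tuple2.of(line.split(",", 2)[0], line))
            .returns(Types.TUPLE(Types.STRING, Types.STRING));
    }
}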

Re: Scheduling sources

2018-09-25 Thread Till Rohrmann
Hi Averell, such a feature is currently not supported by Flink. The scheduling works by starting all sources at the same time. Depending on whether it is a batch or streaming job, you either start deploying consumers once the producers have produced some results, or right away. Cheers, Till

Scheduling sources

2018-09-25 Thread Averell
Hi everyone, I have 2 file sources which I want to start reading in a specified order (e.g. source2 should only start 5 minutes after source1 has started). I could not find any Flink document mentioning this capability, and I also tried to search the mailing list, without any success.
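[Editor's note] One workaround sketch that stays within the user-facing API (not a scheduler feature, and not something the thread settled on): a custom SourceFunction that waits for a fixed delay before it starts reading, here assuming the delayed source reads a local text file. Flink still deploys both sources together; only the first emitted record is held back.

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

import org.apache.flink.streaming.api.functions.source.SourceFunction;

// A source that sleeps for a configurable delay before emitting anything,
// so it effectively "starts" a few minutes after the rest of the job.
public class DelayedFileSource implements SourceFunction<String> {

    private final String path;
    private final long delayMillis;
    private volatile boolean running = true;

    public DelayedFileSource(String path, long delayMillis) {
        this.path = path;
        this.delayMillis = delayMillis;
    }

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        // Flink deploys all sources at the same time (see Till's answer above);
        // this only holds back the second source's output for a while.
        Thread.sleep(delayMillis);

        try (Stream<String> lines = Files.lines(Paths.get(path))) {
            for (String line : (Iterable<String>) lines::iterator) {
                if (!running) {
                    break;
                }
                ctx.collect(line);
            }
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}

Used as env.addSource(new DelayedFileSource("/data/source2", 5 * 60 * 1000L)), source2 would only start producing records five minutes after the job starts.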