Yes, makes sense. I think I'll consider one of those better options. Thanks!
Ishwara
Sent from my iPhone

> On Jan 25, 2018, at 7:12 AM, Piotr Nowojski <pi...@data-artisans.com> wrote:
>
> If you want to go this way, you could:
> - as you proposed, use some busy waiting with reading a file from a distributed file system
> - wait for some network message (opening your own socket)
> - use some other external system for this purpose: Kafka? Zookeeper?
>
> Although all of them seem hacky, and I would prefer (as I proposed before) to pre-compute those ids before running/starting the main Flink application. That would probably be simpler and easier to maintain.
>
> Piotrek
>
>> On 25 Jan 2018, at 13:47, Ishwara Varnasi <ivarn...@gmail.com> wrote:
>>
>> FLIP-17 is promising. Until it's available, I'm planning to do this: extend the Kafka consumer and add logic to hold off consuming until the other source (the fixed set) completes sending and those messages are processed by the application. However, the question is how to let the Kafka consumer know that it should now start consuming messages. What is the correct way to broadcast messages to other tasks at runtime? I had success with the distributed cache (i.e. write a status to a file in one task while the other looks for the status in this file), but it doesn't look like a good solution even though it works.
>> Thanks for the pointers.
>> Ishwara Varnasi
>>
>> Sent from my iPhone
>>
>>> On Jan 25, 2018, at 4:03 AM, Piotr Nowojski <pi...@data-artisans.com> wrote:
>>>
>>> Hi,
>>>
>>> As far as I know there is currently no simple way to do this. Joining a stream with static data is tracked in
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-17+Side+Inputs+for+DataStream+API
>>> and
>>> https://issues.apache.org/jira/browse/FLINK-6131
>>>
>>> One workaround might be to buffer the Kafka input in the state of your TwoInput operator until all of the broadcasted messages have arrived. Another option might be to start your application dynamically: first run some computation to determine the fixed lists of ids, then start the Flink application with those values hardcoded in or passed via command line arguments.
>>>
>>> Piotrek
>>>
>>>> On 25 Jan 2018, at 04:10, Ishwara Varnasi <ivarn...@gmail.com> wrote:
>>>>
>>>> Hello,
>>>> I have a scenario where I have two sources: one of them is a source of a fixed list of ids for preloading (caching certain info, which is slow), and the second one is the Kafka consumer. I need to run Kafka after the first one completes, so I need a mechanism to let the Kafka consumer know that it can start consuming messages. How can I achieve this?
>>>> thanks
>>>> Ishwara Varnasi
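For reference, Piotr's first suggestion (busy waiting on a file in a distributed file system) can be sketched in plain Java. This is only an illustration of the pattern: the marker path and poll interval are hypothetical, and a real job would check a path on the actual distributed file system (e.g. via Flink's FileSystem abstraction or the HDFS client) rather than `java.nio.file`:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StartSignal {

    // Block until the marker file appears, polling every pollMillis.
    // In the actual setup, the preloading stage would create this file
    // (on a distributed file system) once it has finished.
    public static void awaitSignal(Path marker, long pollMillis)
            throws InterruptedException {
        while (!Files.exists(marker)) {
            Thread.sleep(pollMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical default path; in production this would be a DFS path
        // shared between the preloading stage and the Kafka-consuming job.
        Path marker = Paths.get(args.length > 0 ? args[0] : "/tmp/preload-done");
        awaitSignal(marker, 1000L);
        System.out.println("preload finished, safe to start consuming from Kafka");
    }
}
```

A check like this could live in the setup code of a custom source that wraps the Kafka consumer, so the consuming task simply blocks until the preloading stage signals completion. As Piotr notes, this remains hacky; pre-computing the ids before starting the Flink job avoids the coordination entirely.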