Thanks Jesse.I know bounded should do it. But, bounded gets tricky when you
dont know how many records you may have in the data file.There is an upper
bound, but what if there are more records than the upper-bound?I can set a
counter in-memory, and check for its value. But, I need a way to interrupt
p.run().Not sure if there is something like this in Beam API...I appreciate
other folks' opinions on this topic as well....Thanks again.
From: Jesse Anderson <[email protected]>
To: amir bahmanyari <[email protected]>; "[email protected]"
<[email protected]>
Sent: Thursday, June 23, 2016 3:26 PM
Subject: Re: End-of-data indicator in Unbounded KafkaIO
You could make a bounded Kafka IO and wait for an end of file message.That
said, I don't know if Kafka is the right technology for what you're trying to
do. You might just process the files directly at that point.
On Thu, Jun 23, 2016, 3:10 PM amir bahmanyari <[email protected]> wrote:
Sorry colleagues. I know "End-of-data" & Unbounded dont go hand in hand.Lets
say I am invoking KafkaIO unbounded. But at some point I run out of streaming
data (finite number of records in my data file) and p.run() keeps
running/waiting for more data and doesn't terminate of course.How do I know
there has not been any more data recently coming to KafkaIo.read() for a given
amount of time or any other runtime indicaor?Is there a way to interrupt
p.run() upon detecting such an indicator so the execution can move on with the
rest of the code?Thanks+regardsAmir