One thing to add:

There are plans/ideas to change punctuate() semantics to "system time"
instead of "stream time". Would this be helpful for your use case?


-Matthias

On 2/1/17 9:41 AM, Matthias J. Sax wrote:
> Yes and no.
> 
> It does not depend on the number of tuples but on the timestamps of the
> tuples.
> 
> I would assume, that records in the high volume stream have timestamps
> that are only a few milliseconds from each other, while for the low
> volume KTable, record have timestamp differences that are much bigger
> (maybe seconds).
> 
> Thus, even if you schedule a punctuation every 30 seconds, it will get
> triggered as expected. As you get KTable input on a second basis that
> advanced KTable time in larger steps -- thus KTable always "catches up".
> 
> Only for the (real time) case, that a single partition does not make
> process because no new data gets appended that is longer than your
> punctuation interval, some calls to punctuate might not fire.
> 
> Let's say the KTable does not get an update for 5 Minutes, than you
> would miss 9 calls to punctuate(), and get only a single call after the
> KTable update. (Of course, only if all partitions advance time accordingly.)
> 
> 
> Does this make sense?
> 
> 
> -Matthias
> 
> On 2/1/17 7:37 AM, Elliot Crosby-McCullough wrote:
>> Hi there,
>>
>> I've been reading through the Kafka Streams documentation and there seems
>> to be a tricky limitation that I'd like to make sure I've understood
>> correctly.
>>
>> The docs[1] talk about the `punctuate` callback being based on stream time
>> and that all incoming partitions of all incoming topics must have
>> progressed through the minimum time interval for `punctuate` to be called.
>>
>> This seems to be like a problem for situations where you have one very fast
>> and one very slow stream being processed together, for example joining a
>> fast-moving KStream to a slow-changing KTable.
>>
>> Have I misunderstood something or is this relatively common use case not
>> supported with the `punctuate` callback?
>>
>> Many thanks,
>> Elliot
>>
>> [1]
>> http://docs.confluent.io/3.1.2/streams/developer-guide.html#defining-a-stream-processor
>> (see the "Attention" box)
>>
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to