[ https://issues.apache.org/jira/browse/KAFKA-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297570#comment-16297570 ]
Guozhang Wang commented on KAFKA-6323: -------------------------------------- [~frederica] I'd suggest the following: 1) STREAM_TIME punctuation: I agree with [~mjsax] for aligning on scheduled timestamp; to be more specific, suppose user provided interval is {{T}}, we would first schedule the next timstamp as {{T}} exactly; then at any point suppose our next scheduled the next timestamp {{T1}}, and stream time has advanced to {{T2}} because of received data where {{T2 >= T1}}, then we just punctuate with parameter {{floor(T2, T)}} and schedule the next punctuation at {{floor(T2, T) + T}}. 2) WALL_CLOCK_TIME: we do not try to align on interval, i.e. with user provided interval {{T}}, next scheduled time is {{now + T}}, and at the time we did the check with scheduled timestamp {{T1}}, if the current system time is {{T2 (T2 >= T1)}} we punctuate at {{T2}} and schedule the next punctuation at timestamp {{T2 + T}}. The argument is that with long GC / single-record-taking-long-time-to-process / etc scenarios, we can never have a precise or predictable punctuation based on system wall-clock time, so instead we'd just try to expose the exact current system time when punctuation is triggered. > punctuate with WALL_CLOCK_TIME triggered immediately > ---------------------------------------------------- > > Key: KAFKA-6323 > URL: https://issues.apache.org/jira/browse/KAFKA-6323 > Project: Kafka > Issue Type: Bug > Components: streams > Affects Versions: 1.0.0 > Reporter: Frederic Arno > Assignee: Frederic Arno > Fix For: 1.1.0, 1.0.1 > > > When working on a custom Processor from which I am scheduling a punctuation > using WALL_CLOCK_TIME. I've noticed that whatever the punctuation interval I > set, a call to my Punctuator is always triggered immediately. > Having a quick look at kafka-streams' code, I could find that all > PunctuationSchedule's timestamps are matched against the current time in > order to decide whether or not to trigger the punctuator > (org.apache.kafka.streams.processor.internals.PunctuationQueue#mayPunctuate). > However, I've only seen code that initializes PunctuationSchedule's timestamp > to 0, which I guess is what is causing an immediate punctuation. > At least when using WALL_CLOCK_TIME, shouldn't the PunctuationSchedule's > timestamp be initialized to current time + interval? -- This message was sent by Atlassian JIRA (v6.4.14#64029)