Hello everyone,
We're currently using tumbling window in Kafka stream processing our data. One thing we have noticed is the stream time is advanced bases on the records that been processed by the processor node, not by the record for the grouping key. This means, if I have a data with key01 with timestamp01 a key02 with timestamp02 If both key01 and key02 in the same partition, they are processed by the same stream client, the stream time for them is same. If the gap between timestamp01 amd timestamp02 are bigger than the window size, the one with smaller timestamp is always dropped. We checked the code implementation, this is controlled by class KStreamWindowAggregate, located at org.apache.kafka.streams.kstream.internals. How can we keep all the logic from class KStreamWindowAggregate but only change the way it cacuates the stream time? Instead of using the object level variable:observedStreamTime, can we have a map or using state store to keep separate stream time for each key? Looking forward to your insights and feedback. Best regards, Chen
