ableegoldman commented on a change in pull request #10360:
URL: https://github.com/apache/kafka/pull/10360#discussion_r598007541



##########
File path: 
streams/src/main/java/org/apache/kafka/streams/state/internals/ValueAndTimestampSerializer.java
##########
@@ -105,7 +105,7 @@ public void close() {
     }
 
     private static boolean timeIsDecreasing(final byte[] oldRecord, final 
byte[] newRecord) {
-        return extractTimestamp(newRecord) < extractTimestamp(oldRecord);
+        return extractTimestamp(newRecord) <= extractTimestamp(oldRecord);

Review comment:
       This is a bit unfortunate since it essentially breaks the emit-on-change 
semantics 😕  I guess it should be relatively rare for no-op updates to come in 
with the same timestamp, but this still seems like kind of a structural failure 
of Kafka Streams. We shouldn't need to assume that if we find an identical 
record in the state store then we have to forward it on the off-chance it was 
(1) actually the same record, and (2) was only partially processed when we 
happened across an unexpected exception.
   
   I'm not saying I have a better idea that could be implemented quickly & 
safely given the releases we're blocking, but we should at least file a ticket 
so users are aware of this flaw. Otherwise I imagine we might be getting bug 
reports that emit-on-change doesn't work. It's possible some of the orthogonal 
work that's been discussed in the past will end up fixing this on the side (eg 
buffering updates before commit, versioned tables, etc), whenever we finally 
get around to any of that




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to