[ https://issues.apache.org/jira/browse/KAFKA-13386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sergio Duran Vegas updated KAFKA-13386: --------------------------------------- Summary: Foreign Key Join filtering out valid records after a code change / schema evolved (was: Foreign Key Join filtering valid records after a code change / schema evolved) > Foreign Key Join filtering out valid records after a code change / schema > evolved > --------------------------------------------------------------------------------- > > Key: KAFKA-13386 > URL: https://issues.apache.org/jira/browse/KAFKA-13386 > Project: Kafka > Issue Type: Bug > Components: streams > Affects Versions: 2.6.2 > Reporter: Sergio Duran Vegas > Priority: Major > > The join optimization assumes the serializer is deterministic and invariant > across upgrades. I can recall other discussions in which we wanted to rely on > the same property, for example when computing whether an update is a > duplicate result or not. > > [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/internals/foreignkeyjoin/SubscriptionResolverJoinProcessorSupplier.java] > > {code:java} > //If this value doesn't match the current value from the original table, it > is stale and should be discarded. > if (java.util.Arrays.equals(messageHash, currentHash)) {{code} > > A solution for this problem would be that the comparison use foreign-key > reference itself instead of the whole message hash. > > The bug fix proposal is to be allow the user to choose between one method of > comparison or another (whole hash or Fk reference). This would fix the > problem of dropping valid records on certain cases and allow the user to also > choose the current optimized way of checking valid records and intermediate > results dropping. > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)