Jan Filipiak created KAFKA-6599:
-----------------------------------
Summary: KTable KTable join semantics violated when caching enabled
Key: KAFKA-6599
URL: https://issues.apache.org/jira/browse/KAFKA-6599
Project: Kafka
Issue Type: Bug
Components: streams
Reporter: Jan Filipiak
Say a tuple (A, B) got emitted after joining, and then the delete for A goes into the
cache. After that, the delete for B arrives. B's join processor
looks up A and sees `null` while computing the old and new value (at this
point we could execute the joiner with A being null and still emit something, but
it would not represent the actual oldValue). Then, when A's cache flushes, it no longer
sees B, so it also cannot put a proper oldValue. The output can then not be
used for, say, any aggregate, because a delete would not reliably find the old
aggregate it needs to be removed from. Filter will also break, as it stops
null,null changes from propagating. So to me it looks pretty clear that
caching combined with a join breaks KTable semantics, be it my new join or the currently
existing ones.
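The failure sequence above can be sketched with a toy model of a write-back cache. This is illustrative Java only, not Kafka's actual CachingKeyValueStore; all class and variable names are made up for the example:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (NOT Kafka code): a caching store buffers writes and only
// applies them to the underlying store on flush; reads observe buffered
// writes first, including tombstones.
public class CachingJoinDemo {

    static class CachingStore {
        final Map<String, String> underlying = new HashMap<>();
        final Map<String, String> cache = new HashMap<>(); // null value = buffered tombstone

        void put(String key, String value) { cache.put(key, value); }

        // reads see unflushed cached entries first, then the underlying store
        String get(String key) {
            return cache.containsKey(key) ? cache.get(key) : underlying.get(key);
        }

        void flush() {
            for (Map.Entry<String, String> e : cache.entrySet()) {
                if (e.getValue() == null) underlying.remove(e.getKey());
                else underlying.put(e.getKey(), e.getValue());
            }
            cache.clear();
        }
    }

    public static void main(String[] args) {
        CachingStore tableA = new CachingStore();
        CachingStore tableB = new CachingStore();

        // Seed state: the join has already emitted the tuple (a1, b1).
        tableA.underlying.put("k", "a1");
        tableB.underlying.put("k", "b1");

        // Step 1: the delete for A lands in A's cache (not flushed yet).
        tableA.put("k", null);

        // Step 2: the delete for B arrives. B's join processor looks up A
        // to build the oldValue, but the cached tombstone hides a1.
        String aSeenByB = tableA.get("k");
        String oldValueFromBSide = (aSeenByB == null) ? null : aSeenByB + ",b1";
        tableB.put("k", null);
        tableB.flush();

        // Step 3: A's cache flushes. A's join processor looks up B, but B
        // is already gone, so this side cannot build the oldValue either.
        tableA.flush();
        String bSeenByA = tableB.get("k");
        String oldValueFromASide = (bSeenByA == null) ? null : "a1," + bSeenByA;

        // The retraction for (a1, b1) is lost on both sides: a downstream
        // aggregate would never subtract the old joined value.
        System.out.println("oldValue from B side: " + oldValueFromBSide);
        System.out.println("oldValue from A side: " + oldValueFromASide);
    }
}
```

Both lookups come back null, so neither side can emit the (a1, b1) retraction that a downstream aggregate would need.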
This if branch here
[https://github.com/apache/kafka/blob/1.0/streams/src/main/java/org/apache/kafka/streams/state/internals/CachingKeyValueStore.java#L155]
is not useful. I think it is there because, if one delegated the true case
to the underlying store, one would get proper semantics for Streams, but also the
weirdest cache I've ever seen.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)