Chuckame commented on PR #14157:
URL: https://github.com/apache/kafka/pull/14157#issuecomment-2147032733

   After deep diving into the `KeyValueStore` and `KTableValueGetter` 
implementation layers, there is still one blocker that prevents the full fix:
   
   All the `KTableValueGetter` implementations that use runtime mappers (applied 
to the deserialized data) are not materialized, so they have no backing 
store. They have to serialize the mapped data on the fly before it can be 
hashed, which prevents using the original raw data for hashing:
   - `KTableMap[Values]ValueGetter`
   - `KTableFilterValueGetter`
   - `KTableTransformValuesGetter`
   - `KTableKTableAbstractJoins` (for ktable-ktable joins on the same key)
   
   In other words, if we apply an operation such as `join`/`leftJoin` (same 
key), `map`, `mapValues`, `filter` or `transformValues` just before the 
foreign-key join, there is no previous raw data: the value is computed on the 
fly, so it has no backing store.
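   
   A minimal sketch of the situation described above, using plain Java rather than the Kafka Streams classes. Everything in it is a hypothetical stand-in: `OnTheFlyHashSketch`, its string serializer, and SHA-256 as the hash are assumptions chosen for illustration, not what the join code uses. It only shows that a mapped value has no stored bytes of its own, so it must be serialized on the fly before hashing, and that the result is unrelated to the hash of the upstream raw bytes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Locale;
import java.util.function.Function;

final class OnTheFlyHashSketch {

    // Stand-in hash (assumption): SHA-256 from the JDK, used only for illustration.
    static byte[] hash(byte[] raw) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(raw);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical serde: UTF-8 strings.
    static byte[] serialize(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    static String deserialize(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // The mapped value is never stored, so the only way to hash it is
    // deserialize -> map -> serialize -> hash.
    static byte[] hashOfMappedValue(byte[] upstreamRaw, Function<String, String> mapper) {
        String mapped = mapper.apply(deserialize(upstreamRaw));
        return hash(serialize(mapped));
    }

    public static void main(String[] args) {
        byte[] upstreamRaw = serialize("order-42");
        Function<String, String> mapper = v -> v.toUpperCase(Locale.ROOT);

        byte[] fromMapped = hashOfMappedValue(upstreamRaw, mapper);
        byte[] fromRaw = hash(upstreamRaw);

        // The two hashes cover different bytes, so they do not match.
        System.out.println("hashes equal: " + Arrays.equals(fromMapped, fromRaw));
    }
}
```

   With an identity mapper and this deterministic serializer the two hashes do coincide; any mapper that actually changes the value makes them differ.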
   
   Even if we completely revamp the raw store layer as @guozhangwang suggested, 
we will still have the same issue.
   
   ### Idea
   We could generate the hash from the original raw data **before** 
mapping/transforming, but this would be a breaking change: the hash will be 
different for a user upgrading kafka-streams to this version (previously the 
hash was computed from the mapped value).
   
   This change would need a new version for `SubscriptionResponseWrapper` 
(currently v0).
   
   Pros:
   - We now have access to the original raw data, so we can bypass the 
deserialization step
   - We gain performance, as we no longer 
`deserialize -> transform -> serialize -> hash` but just `hash`
   
   Cons:
   - Breaking change for existing hashes: users need to empty the stores, 
otherwise all the events triggered by the right side will be skipped because 
the hash will always be different (which is the bug we currently have). This is 
why we need to introduce version 1 of `SubscriptionResponseWrapper`
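   
   To make this concrete, here is a rough sketch in plain Java, not the actual Kafka Streams code. `VersionedHashSketch`, the `V0`/`V1` constants, `shouldForward`, and SHA-256 as the hash are hypothetical names and choices made for illustration. It shows a hash produced under the old scheme (mapped value) being rejected when checked against the new scheme (original raw data), and one possible way a version field could tell the two schemes apart.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

final class VersionedHashSketch {

    static final byte V0 = 0; // assumption: hash of the serialized mapped value
    static final byte V1 = 1; // assumption: hash of the original raw data

    // Stand-in hash (assumption): SHA-256 from the JDK, for illustration only.
    static byte[] hash(byte[] raw) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(raw);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recompute the hash the same way the given version would have produced it.
    static byte[] currentHash(byte version, byte[] originalRaw, byte[] serializedMapped) {
        switch (version) {
            case V0: return hash(serializedMapped);
            case V1: return hash(originalRaw);
            default: throw new IllegalArgumentException("unsupported version " + version);
        }
    }

    // A response is forwarded only if its hash still matches the current value.
    static boolean shouldForward(byte version, byte[] responseHash,
                                 byte[] originalRaw, byte[] serializedMapped) {
        return Arrays.equals(responseHash, currentHash(version, originalRaw, serializedMapped));
    }

    public static void main(String[] args) {
        byte[] originalRaw = "order-42".getBytes(StandardCharsets.UTF_8);
        byte[] serializedMapped = "ORDER-42".getBytes(StandardCharsets.UTF_8);

        // Hash written before the upgrade, under the old scheme.
        byte[] oldHash = hash(serializedMapped);

        // Checked as if it were a new-scheme hash: never matches, so the event is skipped.
        System.out.println(shouldForward(V1, oldHash, originalRaw, serializedMapped)); // false
        // Checked according to the version it was written with: matches.
        System.out.println(shouldForward(V0, oldHash, originalRaw, serializedMapped)); // true
    }
}
```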
   
   Would you allow this breaking change, @mjsax?


