Hi Guozhang, 

Thanks for reading over my proposal!

> Regarding the format, I'm just thinking if we can change the type of "INT 
> newDataLength" to UINT32?

Good idea, I've updated the KIP to reflect UINT32 since it makes clear the 
value can never be less than zero. 

> `.equals` default implementation on Object is by reference, so if the groupBy 
> did not generate a new object, that may still pass. This means that even if 
> user does not implement the `.equals` function, if the same object is 
> returned then this feature would still be triggered, is that correct?

Correct, I've updated the KIP to call out this edge-case clearly as follows: 

> Since the default `.equals` implementation for an `Object`  is by reference, 
> if a user's `groupBy` returns the same reference for the key, then the oldKey 
> and the newKey will naturally `.equals`  each other. This will result in a 
> single event being sent to the repartition topic. This change in behaviour 
> should be considered a "bug-fix" rather than a "breaking change" as the 
> semantics of the operation remain unchanged, the only thing that changes for 
> users is they no longer see transient "inconsistent" states.  In the worst 
> case, users in this situation will need to update any strict tests that check 
> specifically for the presence of transient "inconsistent" states.

What do you think? 

Thanks, 
Farooq

On 2023/02/07 18:02:24 Guozhang Wang wrote:
> Hello Farooq,
> 
> Thanks for the very detailed proposal! I think this is a great idea.
> Just a few thoughts:
> 
> 1. I regret that we over-optimized the Changed serde format for
> footprint while making it less extensible. It seems to me that a two
> rolling bounce migration is unavoidable.. Regarding the format, I'm
> just thinking if we can change the type of "INT newDataLength" to
> UINT32?
> 
> 2. `.equals` default implementation on Object is by reference, so if
> the groupBy did not generate a new object, that may still pass. This
> means that even if user does not implement the `.equals` function, if
> the same object is returned then this feature would still be
> triggered, is that correct?
> 
> 
> Guozhang
> 
> On Mon, Feb 6, 2023 at 5:05 AM Fq Public <fq...@gmail.com> wrote:
> >
> > Hi everyone,
> >
> > I'd like to share a new KIP for discussion:
> > https://cwiki.apache.org/confluence/x/P5VbDg
> >
> > This could be considered mostly as a "bug fix" but we wanted to raise a KIP
> > for discussion because it involves changes to the serialization format of
> > an internal topic which raises backward compatibility considerations.
> >
> > Please take a look and let me know what you think.
> >
> > Thanks,
> > Farooq
> 

Reply via email to