While I also agree with the trade-offs mentioned by @vvcephei, we can't say 
exactly which approach is better without testing. To me, the bigger savings 
potential is in CPU, but again, we can't say without testing.

But we do need to serialize for sending to the changelog, even if we only 
send on `flush`. Couple that with the fact that an incoming `byte[]` does not 
always get deserialized (due to updates by key), and I'm starting to think 
either approach will be a wash.

So, for now, I'm leaning towards storing `byte[]`:

1. That's what we currently use for `KTable`. While that by itself is not 
enough of a reason, IMHO we need to be careful about taking different 
approaches to similar problems without a clear, demonstrable reason for doing 
so.
2. Benchmarking would really give us the answers we're looking for, but time is 
something we don't have right now if we want to get this into 2.1.
3. I could be wrong about this, but I think the biggest users of suppression are 
going to have several updates per key, so, as @mjsax mentions, many of the 
`byte[]` values are going to get overwritten.

[ Full content available at: https://github.com/apache/kafka/pull/5693 ]