Agree with all the trade-offs you mention. For KTable caches, we also moved to storing `byte[]` to obey the size config. Also note that we don't need to deserialize all `byte[]` arrays, only those that get evicted -- if we have a lot of suppression, many `byte[]` arrays would never be deserialized but simply overwritten. Depending on throughput and the number of unique keys, this might happen quickly enough for them to still be in young gen. Hard to say. Again, more input from @guozhangwang and @bbejeck would be helpful.
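To make the eviction point concrete, here is a minimal sketch of the idea. `ByteBoundedBuffer`, its fields, and the `String` keys are hypothetical names for illustration, not the actual Streams code. It assumes the size bound counts value bytes only (key sizes are ignored) and that eviction is oldest-first:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: a buffer bounded by value bytes that deserializes only on eviction. */
class ByteBoundedBuffer<V> {
    private final LinkedHashMap<String, byte[]> store = new LinkedHashMap<>();
    private final long maxBytes;
    private final Function<byte[], V> deserializer;
    private long currentBytes = 0;
    int deserializeCalls = 0; // exposed only to show how rarely we deserialize

    ByteBoundedBuffer(long maxBytes, Function<byte[], V> deserializer) {
        this.maxBytes = maxBytes;
        this.deserializer = deserializer;
    }

    /** Stores the serialized value and returns any records evicted to honor the byte bound. */
    List<V> put(String key, byte[] value) {
        // An overwritten value is dropped without ever being deserialized.
        byte[] old = store.remove(key);
        if (old != null) {
            currentBytes -= old.length;
        }
        store.put(key, value); // re-inserted, so the key becomes the newest entry
        currentBytes += value.length;

        List<V> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, byte[]>> it = store.entrySet().iterator();
        while (currentBytes > maxBytes && it.hasNext()) {
            Map.Entry<String, byte[]> oldest = it.next();
            it.remove();
            currentBytes -= oldest.getValue().length;
            deserializeCalls++; // deserialization cost is paid only here
            evicted.add(deserializer.apply(oldest.getValue()));
        }
        return evicted;
    }

    long sizeBytes() {
        return currentBytes;
    }
}
```

With heavy suppression, most `put` calls take the overwrite path, so the short-lived `byte[]` values are the only garbage produced and the deserializer is never invoked for them.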
And as above, premature optimization should be avoided. Could we do some prototyping and benchmarking of both approaches? Not sure if there is enough time. Also, it's an internal implementation, and if performance becomes an issue, we can also improve on it in 2.2. [ Full content available at: https://github.com/apache/kafka/pull/5693 ]
