aliehsaeedii commented on code in PR #14626:
URL: https://github.com/apache/kafka/pull/14626#discussion_r1388177181


##########
streams/src/main/java/org/apache/kafka/streams/state/internals/StoreQueryUtils.java:
##########
@@ -351,6 +383,21 @@ public static <V> Function<byte[], V> getDeserializeValue(final StateSerdes<?, V
         return byteArray -> deserializer.deserialize(serdes.topic(), byteArray);
     }

+    public static <V> ValueIterator<VersionedRecord<V>> deserializeValueIterator(final StateSerdes<?, V> serdes,
+                                                                                 final ValueIterator<VersionedRecord<byte[]>> rawValueIterator) {
+
+        final List<VersionedRecord<V>> versionedRecords = new ArrayList<>();
+        while (rawValueIterator.hasNext()) {
+            final VersionedRecord<byte[]> rawVersionedRecord = rawValueIterator.peek();
+            final Deserializer<V> valueDeserializer = serdes.valueDeserializer();
+            final long timestamp = rawVersionedRecord.timestamp();
+            final long validTo = rawVersionedRecord.validTo();
+            final V value = valueDeserializer.deserialize(serdes.topic(), rawVersionedRecord.value());

Review Comment:
   > This approach is better as it avoids the expensive "ad hoc / upfront" deserialization cost (for example, a user might not even iterate over everything and may close the iterator early; and even if they iterate fully, we avoid a "spike" and stretch the deserialization overhead out over a longer time frame, which is preferable).
   >
   > IMHO, we should try to do this no matter how we decide on the other open question (but it might make sense to hold off on changing the code until we figure out a solution for the other open questions, too).

   Thanks, Matthias. Yes, I got that. That's why, in my latest commit, I implemented lazy deserialization. Sorry, I just did not update my comment here.
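   For context, "lazy deserialization" here means wrapping the raw iterator and deserializing each record only when the caller actually consumes it, instead of materializing the whole list up front as the hunk above does. The following is a minimal, self-contained sketch of that pattern; it is an illustration only, not the code from the commit. The `LazyDeserializingIterator` class name, the `Function<byte[], V>` deserializer parameter, and the simplified `ValueIterator` and `VersionedRecord` shapes are stand-ins invented for this sketch, not the actual Kafka Streams internals referenced in the diff.

    import java.util.Iterator;
    import java.util.function.Function;

    // Simplified stand-in for the peekable, closeable iterator used in the diff.
    interface ValueIterator<T> extends Iterator<T>, AutoCloseable {
        T peek();
        @Override
        void close();
    }

    // Simplified stand-in for a versioned record: value plus its validity interval.
    final class VersionedRecord<V> {
        final V value;
        final long timestamp;
        final long validTo;

        VersionedRecord(final V value, final long timestamp, final long validTo) {
            this.value = value;
            this.timestamp = timestamp;
            this.validTo = validTo;
        }
    }

    // Wraps a raw byte[] iterator and deserializes records one at a time, on demand.
    final class LazyDeserializingIterator<V> implements ValueIterator<VersionedRecord<V>> {

        private final ValueIterator<VersionedRecord<byte[]>> rawIterator;
        // e.g. bytes -> serdes.valueDeserializer().deserialize(serdes.topic(), bytes)
        private final Function<byte[], V> deserializer;

        LazyDeserializingIterator(final ValueIterator<VersionedRecord<byte[]>> rawIterator,
                                  final Function<byte[], V> deserializer) {
            this.rawIterator = rawIterator;
            this.deserializer = deserializer;
        }

        @Override
        public boolean hasNext() {
            return rawIterator.hasNext();
        }

        @Override
        public VersionedRecord<V> next() {
            // Deserialization cost is paid here, per record, only when the caller consumes it.
            return deserialize(rawIterator.next());
        }

        @Override
        public VersionedRecord<V> peek() {
            return deserialize(rawIterator.peek());
        }

        @Override
        public void close() {
            // Closing early means records that were never consumed are never deserialized.
            rawIterator.close();
        }

        private VersionedRecord<V> deserialize(final VersionedRecord<byte[]> raw) {
            return new VersionedRecord<>(deserializer.apply(raw.value), raw.timestamp, raw.validTo);
        }
    }

   With a wrapper like this, a caller that closes the iterator after the first few records never pays the deserialization cost for the rest, which is the behavior the quoted comment argues for.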
########## streams/src/main/java/org/apache/kafka/streams/state/internals/StoreQueryUtils.java: ########## @@ -351,6 +383,21 @@ public static <V> Function<byte[], V> getDeserializeValue(final StateSerdes<?, V return byteArray -> deserializer.deserialize(serdes.topic(), byteArray); } + public static <V> ValueIterator<VersionedRecord<V>> deserializeValueIterator(final StateSerdes<?, V> serdes, + final ValueIterator<VersionedRecord<byte[]>> rawValueIterator) { + + final List<VersionedRecord<V>> versionedRecords = new ArrayList<>(); + while (rawValueIterator.hasNext()) { + final VersionedRecord<byte[]> rawVersionedRecord = rawValueIterator.peek(); + final Deserializer<V> valueDeserializer = serdes.valueDeserializer(); + final long timestamp = rawVersionedRecord.timestamp(); + final long validTo = rawVersionedRecord.validTo(); + final V value = valueDeserializer.deserialize(serdes.topic(), rawVersionedRecord.value()); Review Comment: > This approach is better as it avoids expensive "ad hoc / upfront" deserialization cost (for example, a user might not even iterator over everything and close the iterator early, and even if they iterate fully, we avoid a "spike" and stretch out the deserialization overhead over a longer time frame what is preferable) > > IMHO, we should try to do this no matter how we decide on the other open question (but it might make sense to hold off changing the code, until we figure a solution for the other open questions, too). Thanks Matthias. Yes, I got that. That's why, in my latest commit, I implemented lazy deserialization. Sorry, I just did not update my comment here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org