[ https://issues.apache.org/jira/browse/CASSANDRA-16218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218634#comment-17218634 ]
Yifan Cai commented on CASSANDRA-16218: --------------------------------------- The benchmark result shows the new approach takes *14% more time* on calculating the serialized size. Attached the benchmark. So.. it is probably not worthy. We may instead just fix CASSANDRA-16103 and pay more attention when reviewing changes in serialize() and serializedSize(). {code:java} c-16218 [java] Benchmark Mode Cnt Score Error Units [java] SerializationSizeBench.getSerializeSize avgt 6 2.290 ± 0.489 ns/op trunk [java] Benchmark Mode Cnt Score Error Units [java] SerializationSizeBench.getSerializeSize avgt 6 2.003 ± 0.163 ns/op{code} > Simplify the almost duplicated code for calculating serialization size and > serialization > ---------------------------------------------------------------------------------------- > > Key: CASSANDRA-16218 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16218 > Project: Cassandra > Issue Type: Improvement > Components: Messaging/Client, Messaging/Internode > Reporter: Yifan Cai > Assignee: Yifan Cai > Priority: Normal > Fix For: 4.0-beta > > Attachments: SerializationSizeBench.java > > > The current pattern of counting the serialization size and the actual > serialization in the codebase is error-prone and hard to maintain. Those 2 > code paths have almost the same code repeated. > > The pattern can be found in {{org.apache.cassandra.net.Message#Serializer}} > and numerous locations that use {{org.apache.cassandra.transport.CBCodec}}. > > I am proposing a new approach that lets both methods share the same code > path. The code basically looks like the below (in the case of > {{org.apache.cassandra.net.Message#Serializer}}). > {code:java} > // A fake DataOutputPlus that simply increment the size when invoking write* > methods > public class SizeCountingDataOutput implements DataOutputPlus > { > private int size = 0; > public int getSize() > { > return size; > } > public void write(int b) > { > size += 1; > } > public void write(byte[] b) > { > size += b.length; > } > ... > } > {code} > Therefore, in the size calculation, we can supply the fake data output and > call serialize to get the size. > {code:java} > private <T> int serializedSize(Message<T> message, int version) > { > SizeCountingDataOutput out = new SizeCountingDataOutput(); > try > { > serialize(message, out, version); > } > catch (IOException exception) > { > throw new AssertionError("It should not happen!", exception); > } > return out.getSize(); > // The original implementation > // return version >= VERSION_40 ? serializedSizePost40(message, version) : > serializedSizePre40(message, version); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org