[ 
https://issues.apache.org/jira/browse/CASSANDRA-16218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218634#comment-17218634
 ] 

Yifan Cai commented on CASSANDRA-16218:
---------------------------------------

The benchmark shows that the new approach takes about *14% more time* to 
calculate the serialized size (benchmark attached), so it is probably not 
worth it. We may instead just fix CASSANDRA-16103 and pay closer attention when 
reviewing changes to serialize() and serializedSize().

 
{code:java}
c-16218
     [java] Benchmark                                Mode  Cnt  Score   Error  Units
     [java] SerializationSizeBench.getSerializeSize  avgt    6  2.290 ± 0.489  ns/op

trunk
     [java] Benchmark                                Mode  Cnt  Score   Error  Units
     [java] SerializationSizeBench.getSerializeSize  avgt    6  2.003 ± 0.163  ns/op
{code}
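
If the two code paths stay separate, one way to back up the review is a check that serialize() writes exactly as many bytes as serializedSize() reports. The following is only a minimal sketch of that idea: the class {{SizeConsistencyCheck}}, its toy payload (an int id plus a length-prefixed UTF-8 string) and the helper {{sizesAgree}} are hypothetical and use plain {{java.io}} rather than any Cassandra serializer.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class SizeConsistencyCheck
{
    // Hypothetical hand-written serializer: int id, int length, UTF-8 bytes.
    static void serialize(int id, String body, DataOutputStream out) throws IOException
    {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        out.writeInt(id);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Hypothetical hand-written size calculation that must mirror serialize().
    static int serializedSize(int id, String body)
    {
        return Integer.BYTES + Integer.BYTES + body.getBytes(StandardCharsets.UTF_8).length;
    }

    // Serializes into a buffer and compares the real byte count with the computed size.
    static boolean sizesAgree(int id, String body)
    {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer))
        {
            serialize(id, body, out);
        }
        catch (IOException e)
        {
            // An in-memory buffer is not expected to fail; surface it unchecked if it does.
            throw new UncheckedIOException(e);
        }
        return buffer.size() == serializedSize(id, body);
    }
}
```

A check like this would only catch a mismatch for the inputs it is run with, so it complements careful review rather than replacing it.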
 

> Simplify the almost duplicated code for calculating serialization size and 
> serialization
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16218
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16218
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Messaging/Client, Messaging/Internode
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>             Fix For: 4.0-beta
>
>         Attachments: SerializationSizeBench.java
>
>
> The current pattern in the codebase, where the serialized size is computed in 
> one method and the actual serialization is done in another, is error-prone and 
> hard to maintain. The two code paths repeat almost the same code. 
>  
> The pattern can be found in {{org.apache.cassandra.net.Message#Serializer}} 
> and numerous locations that use {{org.apache.cassandra.transport.CBCodec}}. 
>  
> I am proposing a new approach that lets both methods share the same code 
> path. The code looks roughly like the following (in the case of 
> {{org.apache.cassandra.net.Message#Serializer}}).
> {code:java}
> // A fake DataOutputPlus that simply increments the size when write*
> // methods are invoked
> public class SizeCountingDataOutput implements DataOutputPlus
> {
>  private int size = 0;
>  public int getSize()
>  {
>    return size;
>  }
>  public void write(int b)
>  {
>    size += 1;
>  }
>  public void write(byte[] b)
>  {
>    size += b.length;
>  }
>  ...
> }
> {code}
> Therefore, in the size calculation, we can supply the fake data output and 
> call serialize to get the size.
> {code:java}
> private <T> int serializedSize(Message<T> message, int version)
> {
>  SizeCountingDataOutput out = new SizeCountingDataOutput();
>  try
>  {
>    serialize(message, out, version);
>  }
>  catch (IOException exception)
>  {
>    throw new AssertionError("It should not happen!", exception);
>  }
>  return out.getSize();
> // The original implementation:
> // return version >= VERSION_40 ? serializedSizePost40(message, version)
> //                              : serializedSizePre40(message, version);
> }
> {code}
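
The shared-code-path idea from the description can be tried out with standard {{java.io}} types. This is a minimal sketch under stated assumptions, not the proposed patch: {{SizeCountingSketch}}, {{CountingSink}} and the toy payload (an int id plus a length-prefixed UTF-8 string) are hypothetical, and {{java.io.DataOutputStream}} stands in for Cassandra's {{DataOutputPlus}}.

```java
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class SizeCountingSketch
{
    // Discards every byte and only tracks how many were written.
    static final class CountingSink extends OutputStream
    {
        private long size = 0;

        @Override
        public void write(int b)
        {
            size += 1;
        }

        @Override
        public void write(byte[] b, int off, int len)
        {
            size += len;
        }

        long getSize()
        {
            return size;
        }
    }

    // Hypothetical serializer: int id, int length, UTF-8 bytes.
    static void serialize(int id, String body, DataOutput out) throws IOException
    {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        out.writeInt(id);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // The size comes from running the same serialize() against the counting sink.
    static long serializedSize(int id, String body)
    {
        CountingSink sink = new CountingSink();
        try
        {
            serialize(id, body, new DataOutputStream(sink));
        }
        catch (IOException e)
        {
            throw new AssertionError("CountingSink never throws", e);
        }
        return sink.getSize();
    }
}
```

With this shape there is only one description of the wire format, so the size cannot drift from the serialized bytes. The cost is that the size calculation still does the work of serialization (here, encoding the string), which is in line with the slowdown reported in the benchmark.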


