[ 
https://issues.apache.org/jira/browse/CASSANDRA-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726026#comment-13726026
 ] 

Daniel Norberg commented on CASSANDRA-5664:
-------------------------------------------

Some quick observations:

1. There's quite a bit of string encoding and object serialization going on in 
some of the encodedSize() methods. This means that strings/objects will be 
encoded/serialized twice: once to compute the size and once more in encode().

2. When serializing strings, it should be possible to avoid the byte[] 
allocation and copying in encode() by careful use of 
ChannelBuffer.toByteBuffer(), CharBuffer.wrap() and CharsetEncoder.encode().

3. It might be worth investigating whether the code duplication in encode() 
and encodedSize() can be eliminated by, e.g., having encode() operate on a 
higher-level buffer interface with writeString()/writeValue()/etc. methods 
(i.e. a lot of the writeXYZ() methods in CBUtil) and having a counting 
implementation of this interface. The counting implementation could simply sum 
up the size of the output without performing any actual writing/encoding, 
while a writing implementation would perform encoding/serialization and write 
to a ChannelBuffer. Then encode() could be used both for calculating the size 
of the output buffer and for the actual serialization.
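
To illustrate observation 2, here is a minimal sketch using only java.nio. It 
assumes UTF-8 and assumes the destination ByteBuffer is a view over the output 
buffer (with Netty, possibly the result of ChannelBuffer.toByteBuffer() over 
the writable region, provided that view shares the underlying storage, which 
is not guaranteed for every buffer type). The class and method names are 
hypothetical and not taken from the patch.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Cassandra or Netty.
final class DirectStringEncoding {
    /**
     * Encodes s as UTF-8 straight into dst without an intermediate byte[].
     * Returns the number of bytes written; dst's position is advanced.
     */
    static int writeUtf8(String s, ByteBuffer dst) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer src = CharBuffer.wrap(s); // wraps the string, no copy
        int start = dst.position();

        // Single-shot encode: all input is available, so endOfInput is true.
        CoderResult result = encoder.encode(src, dst, true);
        if (result.isUnderflow()) {
            result = encoder.flush(dst);
        }
        if (!result.isUnderflow()) {
            // Overflow (dst too small) or malformed/unmappable input.
            throw new IllegalStateException("Encoding failed: " + result);
        }
        return dst.position() - start;
    }
}
```

In a Netty-based caller, the returned byte count would be what the writer 
index needs to advance by after the call.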

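The idea in observation 3 could look roughly like the sketch below. All names 
(SketchOutput, CountingOutput, BufferOutput, QueryMessageSketch) are 
hypothetical, a plain java.nio.ByteBuffer stands in for Netty's ChannelBuffer, 
and strings are assumed to be written as a 2-byte length followed by UTF-8 
bytes. For brevity the writing implementation still goes through 
String.getBytes().

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical higher-level output interface shared by both passes.
interface SketchOutput {
    void writeInt(int v);
    void writeString(String s);
}

// Sums up the output size without encoding or writing anything.
final class CountingOutput implements SketchOutput {
    private int size;

    public void writeInt(int v) { size += 4; }
    public void writeString(String s) { size += 2 + utf8Length(s); }
    int size() { return size; }

    // UTF-8 length computed from the chars, matching String.getBytes(UTF_8).
    static int utf8Length(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) n += 1;
            else if (c < 0x800) n += 2;
            else if (Character.isHighSurrogate(c) && i + 1 < s.length()
                     && Character.isLowSurrogate(s.charAt(i + 1))) { n += 4; i++; }
            else if (Character.isSurrogate(c)) n += 1; // unpaired: encoded as '?'
            else n += 3;
        }
        return n;
    }
}

// Performs the actual encoding into a buffer.
final class BufferOutput implements SketchOutput {
    private final ByteBuffer buf;
    BufferOutput(ByteBuffer buf) { this.buf = buf; }

    public void writeInt(int v) { buf.putInt(v); }
    public void writeString(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        buf.putShort((short) bytes.length);
        buf.put(bytes);
    }
}

// Hypothetical message with a single encode() used for sizing and writing.
final class QueryMessageSketch {
    final int streamId;
    final String query;

    QueryMessageSketch(int streamId, String query) {
        this.streamId = streamId;
        this.query = query;
    }

    void encode(SketchOutput out) {
        out.writeInt(streamId);
        out.writeString(query);
    }

    ByteBuffer serialize() {
        CountingOutput counter = new CountingOutput();
        encode(counter);                       // first pass: size only
        ByteBuffer buf = ByteBuffer.allocate(counter.size());
        encode(new BufferOutput(buf));         // second pass: real write
        buf.flip();
        return buf;
    }
}
```

The field layout lives only in encode(), so the size calculation cannot drift 
out of sync with the serialization.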
                
> Improve serialization in the native protocol
> --------------------------------------------
>
>                 Key: CASSANDRA-5664
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5664
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 2.0
>
>         Attachments: 0001-Rewrite-encoding-methods.txt, 
> 0002-Avoid-copy-when-compressing-native-protocol-frames.txt
>
>
> Message serialization in the native protocol currently makes use of Netty's 
> ChannelBuffers.wrappedBuffer(). The rationale was to avoid copying the 
> value bytes when such values are biggish. This has a cost, however, 
> especially with lots of small values, and as suggested in CASSANDRA-5422, 
> this might well be a more common scenario for Cassandra, so let's consider 
> serializing directly into a newly allocated buffer.
