[jira] [Commented] (CASSANDRA-5664) Improve serialization in the native protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746560#comment-13746560 ]

Daniel Norberg commented on CASSANDRA-5664:
---

Yeah, the patches as they are now look good. The stuff I brought up can definitely be iterated on in later patches if desired.

Improve serialization in the native protocol

Key: CASSANDRA-5664
URL: https://issues.apache.org/jira/browse/CASSANDRA-5664
Project: Cassandra
Issue Type: Improvement
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
Fix For: 2.0
Attachments: 0001-Rewrite-encoding-methods.txt, 0002-Avoid-copy-when-compressing-native-protocol-frames.txt

Message serialization in the native protocol currently uses Netty's ChannelBuffers.wrappedBuffer(). The rationale was to avoid copying the value bytes when the values are biggish. This has a cost, however, especially with lots of small values, and as suggested in CASSANDRA-5422, small values might well be the more common scenario for Cassandra, so let's consider serializing directly into a newly allocated buffer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-5664) Improve serialization in the native protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746563#comment-13746563 ] Daniel Norberg commented on CASSANDRA-5664: --- As for figuring out the size of the encoded string without encoding it: http://stackoverflow.com/a/8512877 =)
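To make the idea of sizing a string without encoding it concrete, here is a minimal sketch that walks the UTF-16 chars and sums the UTF-8 byte count. The class and method names (`Utf8Length.of`) are hypothetical and chosen for illustration; this is not a copy of the linked answer or of any Cassandra code, and it targets standard UTF-8 as produced by `String.getBytes(StandardCharsets.UTF_8)`.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: counts UTF-8 bytes without allocating an encoded copy. */
final class Utf8Length {
    private Utf8Length() {}

    /** Returns the number of bytes s would occupy in standard UTF-8. */
    static int of(CharSequence s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                bytes += 1;                       // ASCII
            } else if (c < 0x800) {
                bytes += 2;                       // two-byte sequence
            } else if (Character.isSurrogate(c)) {
                boolean paired = Character.isHighSurrogate(c)
                        && i + 1 < s.length()
                        && Character.isLowSurrogate(s.charAt(i + 1));
                if (paired) {
                    bytes += 4;                   // supplementary code point
                    i++;                          // skip the low surrogate
                } else {
                    bytes += 1;                   // unpaired: getBytes() substitutes '?'
                }
            } else {
                bytes += 3;                       // rest of the BMP
            }
        }
        return bytes;
    }

    /** Reference value obtained by actually encoding, for comparison only. */
    static int byEncoding(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

With a helper like this, an encodedSize() method can add up string sizes without producing a throwaway byte[] for every string.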
[jira] [Commented] (CASSANDRA-5664) Improve serialization in the native protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746575#comment-13746575 ] Jonathan Ellis commented on CASSANDRA-5664: --- We've been using TypeSizes.encodedUTF8Length for a while in inter-node serialization.
[jira] [Commented] (CASSANDRA-5664) Improve serialization in the native protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726026#comment-13726026 ]

Daniel Norberg commented on CASSANDRA-5664:
---

Some quick observations:

1. There's quite a bit of string encoding and object serialization going on in some of the encodedSize() methods. This means that strings/objects will be encoded/serialized twice.

2. byte[] allocation and copying in encode() should be possible to avoid when serializing strings by careful use of ChannelBuffer.toByteBuffer(), CharBuffer.wrap() and CharsetEncoder.encode().

3. It might be worth investigating whether the code duplication in encode() and encodedSize() can be eliminated by, e.g., having encode() operate on a higher-level buffer interface with writeString()/writeValue()/etc. methods (i.e. a lot of the writeXYZ() methods in CBUtil) and having a counting implementation of this interface. The counting implementation could simply sum up the size of the output without performing any actual writing/encoding, while a writing implementation would perform encoding/serialization and write to a ChannelBuffer. Then encode() could be used both for calculating the size of the output buffer and for the actual serialization.
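Observations 2 and 3 above could be sketched roughly as follows. This is a sketch under stated assumptions, not the patch's code: `Output`, `CountingOutput`, `BufferOutput` and `SizedEncodeSketch` are hypothetical names, `java.nio.ByteBuffer` stands in for Netty's ChannelBuffer so the example is self-contained, and the wire layout ([short length][bytes] for strings, [int length][bytes] for values) is assumed for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

/** Hypothetical higher-level output interface that encode() writes to. */
interface Output {
    void writeInt(int v);
    void writeString(String s);  // assumed layout: [short length][UTF-8 bytes]
    void writeValue(byte[] v);   // assumed layout: [int length][bytes], -1 for null
}

/** Sums up the output size without encoding or writing anything. */
final class CountingOutput implements Output {
    private int size;
    public void writeInt(int v) { size += 4; }
    public void writeString(String s) { size += 2 + utf8Length(s); }
    public void writeValue(byte[] v) { size += 4 + (v == null ? 0 : v.length); }
    int size() { return size; }

    /** UTF-8 byte count by code point; assumes well-formed input (no unpaired surrogates). */
    static int utf8Length(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            n += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            i += Character.charCount(cp);
        }
        return n;
    }
}

/** Writes into a buffer; strings are encoded in place, with no intermediate byte[]. */
final class BufferOutput implements Output {
    private final ByteBuffer buf;
    BufferOutput(ByteBuffer buf) { this.buf = buf; }
    public void writeInt(int v) { buf.putInt(v); }
    public void writeString(String s) {
        int lengthPos = buf.position();
        buf.putShort((short) 0);                       // placeholder, patched below
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        CoderResult r = enc.encode(CharBuffer.wrap(s), buf, true);
        if (!r.isUnderflow()) throw new IllegalArgumentException("cannot encode: " + r);
        enc.flush(buf);
        // Strings longer than 65535 bytes would not fit the assumed short prefix.
        buf.putShort(lengthPos, (short) (buf.position() - lengthPos - 2));
    }
    public void writeValue(byte[] v) {
        if (v == null) { buf.putInt(-1); return; }
        buf.putInt(v.length);
        buf.put(v);
    }
}

final class SizedEncodeSketch {
    /** A single encode() drives both sizing and writing. */
    static void encode(String name, byte[] value, Output out) {
        out.writeString(name);
        out.writeValue(value);
    }

    static ByteBuffer serialize(String name, byte[] value) {
        CountingOutput counter = new CountingOutput();
        encode(name, value, counter);                   // first pass: size only
        ByteBuffer buf = ByteBuffer.allocate(counter.size());
        encode(name, value, new BufferOutput(buf));     // second pass: actual bytes
        buf.flip();
        return buf;
    }
}
```

The trade-off is that encode() runs twice per message, but the counting pass does no allocation or encoding, and the message layout is described in exactly one place.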