[jira] [Commented] (CASSANDRA-11421) Eliminate allocations of byte array for UTF8 String serializations
[ https://issues.apache.org/jira/browse/CASSANDRA-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239693#comment-15239693 ]

T Jake Luciani commented on CASSANDRA-11421:

Will update once CASSANDRA-11567 is in

> Eliminate allocations of byte array for UTF8 String serializations
> --
>
>                 Key: CASSANDRA-11421
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11421
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Nitsan Wakart
>            Assignee: Nitsan Wakart
>
> When profiling a read workload (YCSB workload c) on Cassandra 3.2.1 I noticed
> a large part of allocation profile was generated from String.getBytes() calls
> on CBUtil::writeString
> I have fixed up the code to use a thread local cached ByteBuffer and
> CharsetEncoder to eliminate the allocations. This results in improved
> allocation profile, and a mild improvement in performance.
> The fix is available here:
> https://github.com/nitsanw/cassandra/tree/fix-write-string-allocation

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
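The issue description mentions a thread-local cached ByteBuffer and CharsetEncoder. The following is a minimal sketch of that idea using only java.nio, not the code from the linked branch. The class name CachedUtf8Encoder, the encode method, and the 4096-byte cap with a fallback to String.getBytes() are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Cassandra or of the linked branch.
final class CachedUtf8Encoder {
    // Assumed cap on the cached scratch buffer; larger strings take the allocating path.
    static final int MAX_CACHED_BYTES = 4096;

    // One encoder per thread. REPLACE mirrors String.getBytes(), which substitutes '?'
    // for malformed input such as an unpaired surrogate.
    private static final ThreadLocal<CharsetEncoder> ENCODER = ThreadLocal.withInitial(() ->
            StandardCharsets.UTF_8.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPLACE)
                    .onUnmappableCharacter(CodingErrorAction.REPLACE));

    // One fixed-size scratch buffer per thread, reused across calls.
    private static final ThreadLocal<ByteBuffer> SCRATCH =
            ThreadLocal.withInitial(() -> ByteBuffer.allocate(MAX_CACHED_BYTES));

    /**
     * Returns the UTF-8 bytes of str as a buffer ready for reading.
     * For small strings this is the thread's scratch buffer, so the contents
     * are only valid until the next call on the same thread.
     */
    static ByteBuffer encode(String str) {
        // UTF-8 needs at most 3 bytes per Java char (a surrogate pair is 2 chars, 4 bytes).
        if ((long) str.length() * 3 > MAX_CACHED_BYTES) {
            return ByteBuffer.wrap(str.getBytes(StandardCharsets.UTF_8));
        }
        CharsetEncoder encoder = ENCODER.get();
        ByteBuffer scratch = SCRATCH.get();
        scratch.clear();
        encoder.reset();
        // CharBuffer.wrap creates a small view object; it does not copy the characters.
        CoderResult result = encoder.encode(CharBuffer.wrap(str), scratch, true);
        if (result.isUnderflow()) {
            result = encoder.flush(scratch);
        }
        if (!result.isUnderflow()) {
            throw new IllegalStateException("Unexpected coder result: " + result);
        }
        scratch.flip();
        return scratch;
    }

    public static void main(String[] args) {
        ByteBuffer encoded = encode("YCSB workload c");
        System.out.println(encoded.remaining() + " bytes encoded");
    }
}
```

A caller would copy the returned buffer into its own output before encoding the next string on the same thread. The fixed cap keeps the per-thread memory bounded, at the cost of one byte[] allocation for unusually long strings.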
[jira] [Commented] (CASSANDRA-11421) Eliminate allocations of byte array for UTF8 String serializations
[ https://issues.apache.org/jira/browse/CASSANDRA-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214163#comment-15214163 ]

T Jake Luciani commented on CASSANDRA-11421:

Sounds like we should just upgrade to the latest netty 4.0 version vs copying over the new method in tree
[jira] [Commented] (CASSANDRA-11421) Eliminate allocations of byte array for UTF8 String serializations
[ https://issues.apache.org/jira/browse/CASSANDRA-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15211651#comment-15211651 ]

Nitsan Wakart commented on CASSANDRA-11421:

See last commit on branch which brings over the missing Netty util method.
[jira] [Commented] (CASSANDRA-11421) Eliminate allocations of byte array for UTF8 String serializations
[ https://issues.apache.org/jira/browse/CASSANDRA-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210420#comment-15210420 ]

Nitsan Wakart commented on CASSANDRA-11421:

Fair point. You could set a max size and fall back to whichever method suits.
The length can be corrected post serialization, e.g.:

    int writerIndex = cb.writerIndex();
    cb.writeShort(0);
    int lengthBytes = ByteBufUtilTemp.writeUtf8(cb, str);
    cb.setShort(writerIndex, lengthBytes);

> Eliminate allocations of byte array for UTF8 String serializations
> --
>
>                 Key: CASSANDRA-11421
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11421
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Nitsan Wakart
>
> When profiling a read workload (YCSB workload c) on Cassandra 3.2.1 I noticed
> a large part of allocation profile was generated from String.getBytes() calls
> on CBUtil::writeString
> I have fixed up the code to use a thread local cached ByteBuffer and
> CharsetEncoder to eliminate the allocations. This results in improved
> allocation profile, and a mild improvement in performance.
> The fix is available here:
> https://github.com/nitsanw/cassandra/tree/fix-write-string-allocation
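The snippet in the comment above reserves the two length bytes, encodes straight into the buffer, and then patches the length in. A self-contained sketch of the same pattern follows, using java.nio.ByteBuffer in place of Netty's ByteBuf so it runs without extra dependencies. The class LengthPrefixedUtf8 and both of its methods are hypothetical names, and the hand-written encoder loop is only an illustration, not Netty's implementation.

```java
import java.nio.ByteBuffer;

// Hypothetical helper illustrating "write a placeholder, encode, then fix up the length".
final class LengthPrefixedUtf8 {

    /** Encodes seq as UTF-8 directly into out and returns the number of bytes written. */
    static int writeUtf8(ByteBuffer out, CharSequence seq) {
        int start = out.position();
        int len = seq.length();
        for (int i = 0; i < len; i++) {
            char c = seq.charAt(i);
            if (c < 0x80) {                       // 1 byte: ASCII
                out.put((byte) c);
            } else if (c < 0x800) {               // 2 bytes
                out.put((byte) (0xC0 | (c >> 6)));
                out.put((byte) (0x80 | (c & 0x3F)));
            } else if (Character.isSurrogate(c)) {
                if (Character.isHighSurrogate(c) && i + 1 < len
                        && Character.isLowSurrogate(seq.charAt(i + 1))) {
                    // 4 bytes: a valid surrogate pair forms one supplementary code point
                    int cp = Character.toCodePoint(c, seq.charAt(++i));
                    out.put((byte) (0xF0 | (cp >> 18)));
                    out.put((byte) (0x80 | ((cp >> 12) & 0x3F)));
                    out.put((byte) (0x80 | ((cp >> 6) & 0x3F)));
                    out.put((byte) (0x80 | (cp & 0x3F)));
                } else {
                    out.put((byte) '?');          // unpaired surrogate, as String.getBytes() does
                }
            } else {                              // 3 bytes: rest of the BMP
                out.put((byte) (0xE0 | (c >> 12)));
                out.put((byte) (0x80 | ((c >> 6) & 0x3F)));
                out.put((byte) (0x80 | (c & 0x3F)));
            }
        }
        return out.position() - start;
    }

    /** Writes an unsigned 16-bit byte length followed by the UTF-8 bytes of str. */
    static void writeString(ByteBuffer out, String str) {
        int lengthIndex = out.position();
        out.putShort((short) 0);                  // placeholder for the length
        int lengthBytes = writeUtf8(out, str);
        if (lengthBytes > 0xFFFF) {
            out.position(lengthIndex);            // undo the partial write
            throw new IllegalArgumentException("String too long: " + lengthBytes + " bytes");
        }
        out.putShort(lengthIndex, (short) lengthBytes); // absolute put, position unchanged
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        writeString(buf, "workload c");
        System.out.println(buf.position() + " bytes written including the length prefix");
    }
}
```

Because the byte count is only known after encoding, patching the prefix afterwards avoids a separate pass over the string to compute its encoded length.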
[jira] [Commented] (CASSANDRA-11421) Eliminate allocations of byte array for UTF8 String serializations
[ https://issues.apache.org/jira/browse/CASSANDRA-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210410#comment-15210410 ]

Benedict commented on CASSANDRA-11421:

I wouldn't necessarily consider it a good idea to cache buffers of unbounded size. Right now it is probably fine, but it is fragile in the face of possible future changes. This is a common problem that should probably be solved generically, though no doubt not here.

It's a tremendous shame this encoding differs from DataOutputPlus.writeUTF, and that both persist a length up-front.
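To make the remark about differing encodings concrete, here is a small sketch comparing standard UTF-8 with the "modified UTF-8" that java.io.DataOutput.writeUTF specifies. It assumes DataOutputPlus.writeUTF follows the java.io.DataOutput contract, which this thread does not state. The class Utf8Flavours and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper comparing the two encodings' payload sizes.
final class Utf8Flavours {

    /** Payload size under DataOutput.writeUTF, excluding its 2-byte length prefix. */
    static int modifiedUtf8Length(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size() - 2;
    }

    /** Payload size under standard UTF-8. */
    static int standardUtf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String[] samples = { "abc", "\u0000", "\uD83D\uDE00" };
        for (String s : samples) {
            System.out.println(modifiedUtf8Length(s) + " vs " + standardUtf8Length(s));
        }
    }
}
```

The two agree on plain ASCII text but diverge on U+0000 (two bytes versus one) and on supplementary characters (six bytes versus four), so the byte streams are not interchangeable even though both start with a 2-byte length.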
[jira] [Commented] (CASSANDRA-11421) Eliminate allocations of byte array for UTF8 String serializations
[ https://issues.apache.org/jira/browse/CASSANDRA-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210061#comment-15210061 ]

Nitsan Wakart commented on CASSANDRA-11421:

Note that there's a better solution using the later Netty ByteBufUtil::encodeUtf8
https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/ByteBufUtil.java#L379
This would require updating your dependencies or copying the code over as a temporary measure.