[ 
https://issues.apache.org/jira/browse/CASSANDRA-15350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971341#comment-16971341
 ] 

Yifan Cai edited comment on CASSANDRA-15350 at 11/11/19 6:02 PM:
-----------------------------------------------------------------

-The change of calculating the exact size of UTF-8 string has negative 
performance impact. It needs to iterate through the entire string to determine 
the actual size in UTF-8. -

The previous [benchmark 
setup|https://issues.apache.org/jira/secure/attachment/12985494/Utf8StringEncodeBench.java]
 was wrong. For the cases of writing with exact size, `reserveAndWriteUtf8` 
should be called to avoid resizing the buffer. 

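I have refined the benchmarks and introduced 2 new ones that leverage the 
encodeSize from the previous step. The results show a performance improvement: 
for long text, the average time drops from about 572 ns/op to about 460 ns/op 
with the exact size reserved, and to about 216 ns/op when the size calculation 
is also skipped.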


{code:java}
     [java] Benchmark                                                  Mode  Cnt    Score    Error  Units
     [java] Utf8StringEncodeBench.writeLongText                        avgt    6  571.949 ± 19.791  ns/op
     [java] Utf8StringEncodeBench.writeLongTextWithExactSize           avgt    6  459.932 ± 27.790  ns/op
     [java] Utf8StringEncodeBench.writeLongTextWithExactSizeSkipCalc   avgt    6  216.085 ±  3.480  ns/op
     [java] Utf8StringEncodeBench.writeShortText                       avgt    6   62.775 ±  6.159  ns/op
     [java] Utf8StringEncodeBench.writeShortTextWithExactSize          avgt    6   44.071 ±  5.645  ns/op
     [java] Utf8StringEncodeBench.writeShortTextWithExactSizeSkipCalc  avgt    6   36.358 ±  5.135  ns/op
{code}

* writeLongText: the original implementation, which calls 
`ByteBufUtils.writeUtf8`. It over-estimates the encoded size of the string, 
which causes the buffer to be resized.
* writeLongTextWithExactSize: calls `TypeSizes.encodeUTF8Length` to reserve 
exactly the number of bytes to be written.
* writeLongTextWithExactSizeSkipCalc: skips the UTF-8 length calculation. 
Because the encodeSize of a message is calculated before the message is 
encoded, the size of the final bytes is already known, so we can simply 
reserve the remaining capacity of the buffer.
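
To illustrate the exact-size idea, here is a minimal sketch using only the JDK. The class `Utf8ExactSizeSketch` and its methods `utf8Length` and `encodeExact` are hypothetical names for this example; they are not the Cassandra or Netty code measured above, and the real `TypeSizes.encodeUTF8Length` may be implemented differently. The sketch counts the encoded bytes in one pass over the string, then encodes into a buffer of exactly that size so that no resize is needed.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Cassandra or Netty.
final class Utf8ExactSizeSketch
{
    /**
     * Counts the bytes the UTF-8 encoding of s would occupy, without encoding it.
     * Assumes well-formed UTF-16 input (no unpaired surrogates).
     */
    static int utf8Length(CharSequence s)
    {
        int bytes = 0;
        for (int i = 0; i < s.length(); i++)
        {
            char c = s.charAt(i);
            if (c < 0x80)
                bytes += 1;                       // ASCII
            else if (c < 0x800)
                bytes += 2;                       // two-byte sequence
            else if (Character.isHighSurrogate(c)
                     && i + 1 < s.length()
                     && Character.isLowSurrogate(s.charAt(i + 1)))
            {
                bytes += 4;                       // surrogate pair: one 4-byte code point
                i++;                              // skip the low surrogate
            }
            else
                bytes += 3;                       // rest of the Basic Multilingual Plane
        }
        return bytes;
    }

    /** Encodes s into a buffer allocated with exactly utf8Length(s) bytes. */
    static ByteBuffer encodeExact(String s)
    {
        ByteBuffer out = ByteBuffer.allocate(utf8Length(s));
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CoderResult result = encoder.encode(CharBuffer.wrap(s), out, true);
        // Underflow means all input was consumed; anything else is overflow or malformed input.
        if (!result.isUnderflow())
            throw new IllegalStateException("unexpected encoder result: " + result);
        out.flip();
        return out;
    }

    public static void main(String[] args)
    {
        String text = "h\u00e9llo \u20ac";
        ByteBuffer encoded = encodeExact(text);
        System.out.println(utf8Length(text) + " bytes reserved, " + encoded.remaining() + " bytes written");
    }
}
```

The counting pass is the cost that the SkipCalc variants avoid: when the total encoded size of the message is already known, the writer can reserve the remaining capacity instead of calling something like `utf8Length` again.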

 


was (Author: yifanc):
The change of calculating the exact size of UTF-8 string has negative 
performance impact. It needs to iterate through the entire string to determine 
the actual size in UTF-8. 

The [benchmark 
setup|https://issues.apache.org/jira/secure/attachment/12985494/Utf8StringEncodeBench.java]
 and the result:
{code:java}
     [java] Benchmark                                          Mode  Cnt    Score     Error  Units
     [java] Utf8StringEncodeBench.writeLongText                avgt    6  552.458 ±   9.141  ns/op
     [java] Utf8StringEncodeBench.writeLongTextWithExactSize   avgt    6  787.676 ± 120.057  ns/op
     [java] Utf8StringEncodeBench.writeShortText               avgt    6   70.311 ±   8.031  ns/op
     [java] Utf8StringEncodeBench.writeShortTextWithExactSize  avgt    6   71.716 ±   4.790  ns/op
{code}
I will revert the change. 

 

> Add CAS “uncertainty” and “contention” messages that are currently propagated 
> as a WriteTimeoutException.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15350
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15350
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/Lightweight Transactions
>            Reporter: Alex Petrov
>            Assignee: Yifan Cai
>            Priority: Normal
>              Labels: protocolv5, pull-request-available
>         Attachments: Utf8StringEncodeBench.java
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now, CAS uncertainty introduced in 
> https://issues.apache.org/jira/browse/CASSANDRA-6013 is propagated as a 
> WriteTimeout. One of the conditions under which it manifests is when there’s 
> at least one acceptor that has accepted the value, which means that this 
> value _may_ still get accepted during a later round, despite the proposer’s 
> failure. A similar problem happens with CAS contention, which is also 
> indistinguishable from a “regular” timeout, even though it is reported 
> correctly in metrics.


