[jira] [Commented] (CASSANDRA-9766) Bootstrap outgoing streaming speeds are much slower than during repair

T Jake Luciani (JIRA) Mon, 02 May 2016 13:49:09 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267478#comment-15267478
 ]


T Jake Luciani commented on CASSANDRA-9766:
-------------------------------------------

bq. Running LongStreamingTest on my laptop went from 24/25 seconds on trunk 
HEAD to 22/23 seconds with the patch applied.

Hmm, looks like the BTreeSearchIterator recycling is causing too high a CPU hit 
to be worth the GC savings.  I've pushed a quick commit which brings the test 
back down to 19 seconds for me; could you try it out and let me know what you 
see? Without recycling, BTreeSearchIterator accounts for >25% of the heap 
pressure :(

I think since the object is so hotly used it just causes too much contention on 
the recycler. It's important to avoid too much allocation, but it seems like in 
this case it's gone too far.  Perhaps we can avoid the recycler here and just 
keep a reusable BTreeSearchIterator in the SSTableWriter. 
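
To make the "reusable iterator in the writer" idea concrete, here is a minimal sketch using only the JDK. It is not Cassandra code: `ReusableSearchIterator` and `SketchWriter` are hypothetical stand-ins for BTreeSearchIterator and SSTableWriter, and it assumes the writer is only ever driven by a single thread, so no recycler or thread-local lookup is needed.

```java
import java.util.Arrays;

// Hypothetical stand-in for a forward-only search iterator over sorted keys.
// This is NOT Cassandra's BTreeSearchIterator, just an illustration of reuse.
final class ReusableSearchIterator {
    private long[] keys = new long[0];
    private int position;

    // Re-point the iterator at a new sorted array instead of allocating a new one.
    ReusableSearchIterator reset(long[] sortedKeys) {
        this.keys = sortedKeys;
        this.position = 0;
        return this;
    }

    // Search forward from the current position; returns the index, or -1 if absent.
    int next(long target) {
        int idx = Arrays.binarySearch(keys, position, keys.length, target);
        if (idx < 0) {
            position = -idx - 1; // insertion point: later lookups skip everything before it
            return -1;
        }
        position = idx + 1;
        return idx;
    }
}

// Hypothetical writer that owns exactly one iterator for its whole lifetime.
// Assumes single-threaded use, so there is no pool and no contention.
final class SketchWriter {
    private final ReusableSearchIterator iterator = new ReusableSearchIterator();

    // Counts how many of the (sorted) targets are present in the (sorted) keys.
    int countPresent(long[] sortedKeys, long[] sortedTargets) {
        ReusableSearchIterator it = iterator.reset(sortedKeys);
        int found = 0;
        for (long target : sortedTargets) {
            if (it.next(target) >= 0) {
                found++;
            }
        }
        return found;
    }
}
```

The trade-off in this sketch is that the iterator's lifetime is tied to its owner: there is no per-use allocation and no shared recycler to contend on, but it only works where one owner uses the iterator from one thread at a time.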

bq. I would like to make sure this is justifiable and I would probably want the 
opinion of one more committer with more experience than me
The FastThreadLocal changes were an optimization by [~benedict] from a [while 
back|https://github.com/netty/netty/pull/2504], plus some recycler changes. 
Since we already use Netty and it's built to be used as a general library, it 
seemed like a good place to start. 

bq. do we have a micro benchmark comparing Netty FastThreadLocal and the JDK 
ThreadLocal? 
The Netty FastThreadLocal microbenchmarks show a significant throughput 
increase over the JDK ThreadLocal (roughly 1.56x in this run):

{code}
Benchmark                                    Mode  Cnt      Score      Error  Units
FastThreadLocalBenchmark.fastThreadLocal    thrpt   20  55452.027 ±  725.713  ops/s
FastThreadLocalBenchmark.jdkThreadLocalGet  thrpt   20  35481.888 ± 1471.647  ops/s
{code}

bq. Should we perhaps make recyclable objects ref counted, at least for 
debugging purposes when Ref.DEBUG_ENABLED is true?

The reason I didn't do this, and one reason I like the Recycler, is that 
recycling every object isn't strictly required. If we added ref counting, it 
would force every code path to clean up properly even when we don't care about 
recycling. 
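
As a rough illustration of why optional recycling is convenient, here is a minimal JDK-only sketch of a per-thread pool. It is not Netty's Recycler and does not use its API; `OptionalPool` is a hypothetical name. The point is that `recycle` is purely an optimization: an object that is never handed back is simply garbage collected, whereas ref counting would require every path to release.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical per-thread object pool (NOT Netty's Recycler).
// Handing objects back is optional; forgetting to do so is harmless.
final class OptionalPool<T> {
    private final Supplier<T> factory;
    private final int maxPooled;
    // One stack per thread, so get/recycle on the same thread need no locking.
    private final ThreadLocal<ArrayDeque<T>> stack = ThreadLocal.withInitial(ArrayDeque::new);

    OptionalPool(Supplier<T> factory, int maxPooled) {
        this.factory = factory;
        this.maxPooled = maxPooled;
    }

    // Reuse a pooled instance if this thread has one, otherwise allocate.
    T get() {
        T pooled = stack.get().pollFirst();
        return pooled != null ? pooled : factory.get();
    }

    // Optional: return an instance for reuse. Beyond maxPooled it is dropped
    // and left to the garbage collector. Callers must reset state themselves.
    void recycle(T obj) {
        ArrayDeque<T> s = stack.get();
        if (s.size() < maxPooled) {
            s.addFirst(obj);
        }
    }
}
```

In this sketch a caller on a hot path can call `recycle` to cut allocation, while a rarely used or error path can skip it without leaking anything or tripping a leak detector.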



> Bootstrap outgoing streaming speeds are much slower than during repair
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-9766
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9766
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>         Environment: Cassandra 2.1.2. more details in the pdf attached 
>            Reporter: Alexei K
>            Assignee: T Jake Luciani
>              Labels: performance
>             Fix For: 3.x
>
>         Attachments: problem.pdf
>
>
> I have a cluster in the Amazon cloud; it's described in detail in the attachment. 
> What I've noticed is that during bootstrap we never go above 12MB/sec 
> transmission speeds, and those speeds flatline almost as if we're hitting 
> some sort of limit (this remains true for other tests that I've run). 
> However, during repair we see much higher, variable sending rates. I've 
> provided network charts in the attachment as well. Is there an explanation 
> for this? Is something wrong with my configuration, or is it a possible bug?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
