[jira] [Commented] (CASSANDRA-9766) Bootstrap outgoing streaming speeds are much slower than during repair

Stefania (JIRA) Mon, 02 May 2016 22:13:56 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268120#comment-15268120 ]


Stefania commented on CASSANDRA-9766:
-------------------------------------

It's looking much better without recycling {{BTreeSearchIterator}}:

{code}
grep ERROR build/test/logs/TEST-org.apache.cassandra.streaming.LongStreamingTest.log
ERROR [main] 2016-05-03 10:37:04,004 SLF4J: stderr
ERROR [main] 2016-05-03 10:37:34,737 Writer finished after 25 seconds....
ERROR [main] 2016-05-03 10:37:34,738 File : /tmp/1462243029050-0/cql_keyspace/table1/ma-1-big-Data.db
ERROR [main] 2016-05-03 10:37:55,165 Finished Streaming in 20.41 seconds: 23.52 Mb/sec
ERROR [main] 2016-05-03 10:38:15,054 Finished Streaming in 19.89 seconds: 24.14 Mb/sec
ERROR [main] 2016-05-03 10:38:56,983 Finished Compacting in 41.93 seconds: 23.09 Mb/sec
{code}

I would suggest not recycling {{BTreeSearchIterator}}. Recycling this 
iterator is quite dangerous; see for example 
[here|https://github.com/apache/cassandra/compare/trunk...tjake:faster-streaming#diff-81fd7ce7915c147ea84590e25f77ca47R361].
 I think it would significantly extend the scope and risk of this patch for 
very little gain, but feel free to prove me wrong if you want to experiment 
with alternative recycling options.
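
To make the hazard concrete, here is a minimal Java sketch of how recycling an iterator can go wrong. It is a generic illustration only, not the code from the linked diff: {{RecyclableCursor}}, its pool and {{StaleIteratorDemo}} are hypothetical names invented for this example, and the real {{BTreeSearchIterator}} is more involved. The sketch assumes a single thread and shows that a caller who keeps a reference after calling {{recycle()}} silently reads another caller's data.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;

// Hypothetical pooled iterator; a stand-in, NOT Cassandra's BTreeSearchIterator.
final class RecyclableCursor {
    // Single-threaded pool of recycled instances (assumption for the sketch).
    private static final ArrayDeque<RecyclableCursor> POOL = new ArrayDeque<>();

    private List<String> items;
    private int pos;

    // Reuse a pooled instance if one is available, otherwise allocate.
    static RecyclableCursor over(List<String> items) {
        RecyclableCursor c = POOL.pollFirst();
        if (c == null) c = new RecyclableCursor();
        c.items = items;
        c.pos = 0;
        return c;
    }

    boolean hasNext() { return pos < items.size(); }

    String next() { return items.get(pos++); }

    // Reset state and return the instance to the pool.
    void recycle() {
        items = null;
        pos = 0;
        POOL.addFirst(this);
    }
}

final class StaleIteratorDemo {
    // Returns what a stale reference reads after its cursor has been recycled.
    static String readThroughStaleReference() {
        RecyclableCursor first = RecyclableCursor.over(Arrays.asList("a", "b"));
        first.next();     // consumes "a"
        first.recycle();  // the caller is done, but still holds `first`

        // A second caller is handed the very same instance.
        RecyclableCursor second = RecyclableCursor.over(Arrays.asList("x", "y"));

        // The stale reference now iterates over the second caller's data
        // and also advances the second caller's position.
        return first.next();
    }

    public static void main(String[] args) {
        System.out.println(readThroughStaleReference());
    }
}
```

Nothing in the type system prevents the stale access, which is why every code path that holds the iterator has to be audited before recycling is safe.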

Regarding using our own {{FastThreadLocal}} vs. keeping the dependency on 
Netty, I'm really not sure. On the one hand, I don't want to cause additional 
work for no good reason, and I don't particularly like duplicating code. On the 
other hand, Netty internal classes such as {{InternalThreadLocalMap}} could 
change at any time, so upgrading Netty could introduce performance regressions. 
I'm happy either way.

Regarding ref. counting, you're quite right that we don't need it: if an object 
is not recycled, it will simply be GC-ed.

A few more points:

* Why do we need to allocate cells lazily in {{BTreeRow.Builder}}? Do we really 
create many of these without ever adding cells to them?

* [{{dob.recycle()}}|https://github.com/apache/cassandra/compare/trunk...tjake:faster-streaming#diff-c06541855022eca5fd794dd24ff02f89R182]
 should be in a {{finally}} block, since {{serializeRowBody()}} can throw.

* I don't understand [this 
line|https://github.com/apache/cassandra/compare/trunk...tjake:faster-streaming#diff-ee37e803d70421ce823d42e02620d589R207]:
 when the object is recycled, the buffer should be null (from {{close()}}) and 
{{indexSamplesSerializedSize}} should be zero (from {{create()}}), so why do we 
need to set {{indexOffsets\[columnIndexCount\] = 0}} explicitly?

* {{ColumnIndex.create()}} is only called in {{BTW.append}}. It would be nice 
if we could attach this object somewhere rather than constantly pushing it onto 
and popping it from the recycler stack. We could just store it in BTW if we 
could be sure that {{BTW.append}} is not called by multiple threads, or maybe 
keep a queue of these objects in BTW?
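
The {{finally}} point above can be sketched as follows. This is a minimal, self-contained Java illustration under stated assumptions, not the patch itself: {{PooledBuffer}} is a hypothetical stand-in for the recycled output buffer, and this {{serializeRowBody()}} is a toy method that only shares its name with the real one. The idea is simply that the buffer goes back to the pool whether or not serialization throws.

```java
import java.io.IOException;
import java.util.ArrayDeque;

// Hypothetical pooled buffer; a stand-in, NOT Cassandra's DataOutputBuffer.
final class PooledBuffer {
    // Per-thread stack of recycled buffers (assumption for the sketch).
    private static final ThreadLocal<ArrayDeque<PooledBuffer>> POOL =
        ThreadLocal.withInitial(ArrayDeque::new);

    final StringBuilder data = new StringBuilder();

    // Take a buffer from this thread's pool, or allocate a new one.
    static PooledBuffer acquire() {
        PooledBuffer b = POOL.get().pollFirst();
        return b != null ? b : new PooledBuffer();
    }

    // Clear the contents and return the buffer to this thread's pool.
    void recycle() {
        data.setLength(0);
        POOL.get().addFirst(this);
    }

    // Number of buffers currently pooled for this thread.
    static int pooledCount() {
        return POOL.get().size();
    }
}

final class RecycleInFinally {
    // Toy serializer that can throw, like the method discussed above.
    static void serializeRowBody(String row, PooledBuffer out) throws IOException {
        if (row == null) throw new IOException("cannot serialize a null row");
        out.data.append(row);
    }

    // Returns the serialized size; the buffer is recycled on every path.
    static int serialize(String row) throws IOException {
        PooledBuffer dob = PooledBuffer.acquire();
        try {
            serializeRowBody(row, dob);
            return dob.data.length();
        } finally {
            dob.recycle(); // runs on normal return and when an exception escapes
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(serialize("abc"));
    }
}
```

Without the {{finally}}, a failed serialization would not corrupt anything here (the buffer would just be garbage collected), but the pool would lose an instance on every failure and the benefit of recycling would erode.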

> Bootstrap outgoing streaming speeds are much slower than during repair
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-9766
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9766
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>         Environment: Cassandra 2.1.2. more details in the pdf attached 
>            Reporter: Alexei K
>            Assignee: T Jake Luciani
>              Labels: performance
>             Fix For: 3.x
>
>         Attachments: problem.pdf
>
>
> I have a cluster in the Amazon cloud; it's described in detail in the 
> attachment. What I've noticed is that during bootstrap we never go above 
> 12MB/sec transmission speeds, and those speeds flatline, almost as if we're 
> hitting some sort of limit (this remains true for other tests that I've run). 
> During repair, however, we see much higher, variable sending rates. I've 
> provided network charts in the attachment as well. Is there an explanation 
> for this? Is something wrong with my configuration, or is it a possible bug?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
