RE: StreamCoordinator.ConnectionsPerHost set to 1

2016-06-17 Thread Anubhav Kale
Thanks Paulo. I made some changes along those lines and am seeing good 
improvement. I will discuss further (with a possible patch) on 
https://issues.apache.org/jira/browse/CASSANDRA-4663 (that ticket is for 
bootstrap, so maybe we can repurpose it for rebuilds or create a separate one).




Re: StreamCoordinator.ConnectionsPerHost set to 1

2016-06-16 Thread Paulo Motta
Increasing the number of threads alone won't help, because you need to add
connectionsPerHost-awareness to StreamPlan.requestRanges (otherwise only a
single connection per host is created), similar to what was done to
StreamPlan.transferFiles by CASSANDRA-3668, but maybe a bit trickier. There's
an open ticket to support that: CASSANDRA-4663.

There's also another discussion on improving rebuild parallelism on
CASSANDRA-12015.
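
As a rough illustration of what connectionsPerHost-awareness on the request side could mean, the sketch below splits the ranges requested from one host into up to N groups, one per connection. It is not Cassandra code: the class name RangeRequestSplitter and the method split are hypothetical, and the real StreamPlan/StreamCoordinator wiring is not shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Cassandra. */
final class RangeRequestSplitter {
    private RangeRequestSplitter() {}

    /**
     * Distributes the requested ranges round-robin into groups.
     * Each group stands for the ranges one connection to the host would request.
     */
    static <T> List<List<T>> split(List<T> ranges, int connectionsPerHost) {
        if (connectionsPerHost < 1) {
            throw new IllegalArgumentException("connectionsPerHost must be >= 1");
        }
        // Use no more groups than there are ranges, but always at least one.
        int groups = Math.max(1, Math.min(connectionsPerHost, ranges.size()));
        List<List<T>> result = new ArrayList<>(groups);
        for (int i = 0; i < groups; i++) {
            result.add(new ArrayList<>());
        }
        // Range i goes to group (i mod groups).
        for (int i = 0; i < ranges.size(); i++) {
            result.get(i % groups).add(ranges.get(i));
        }
        return result;
    }
}
```

With connectionsPerHost = 1 this degenerates to a single group holding every range, which matches the behavior described in the original question.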



StreamCoordinator.ConnectionsPerHost set to 1

2016-06-16 Thread Anubhav Kale
Hello,

I noticed that StreamCoordinator.ConnectionsPerHost is always set to 1 
(Cassandra 2.1.13). If I am reading the code correctly, this means there will 
always be just one socket (well, technically two, one for each direction) 
between nodes when rebuilding, so the data will always be streamed serially.

Have folks experimented with increasing this? It appears that some parallelism 
here might help rebuilds in a significant way assuming we aren't hitting 
bandwidth caps (it's a pain for us at the moment to rebuild nodes holding 
500GB).
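
A back-of-the-envelope sketch of that reasoning: if each connection sustains some fixed throughput and the link has a hard cap, extra connections only help until the cap is reached. The model (linear scaling up to the cap) and the names RebuildEstimate, effectiveMBps and hours are assumptions made for illustration, not measurements or Cassandra APIs.

```java
/** Hypothetical estimator; assumes throughput scales linearly per connection up to a link cap. */
final class RebuildEstimate {
    private RebuildEstimate() {}

    /** Aggregate throughput in MB/s: connections * per-connection rate, limited by the link cap. */
    static double effectiveMBps(int connections, double perConnectionMBps, double linkCapMBps) {
        if (connections < 1) {
            throw new IllegalArgumentException("connections must be >= 1");
        }
        return Math.min(connections * perConnectionMBps, linkCapMBps);
    }

    /** Hours needed to stream dataGB gigabytes (1 GB = 1024 MB) at the given throughput. */
    static double hours(double dataGB, double throughputMBps) {
        return dataGB * 1024.0 / throughputMBps / 3600.0;
    }
}
```

For example, RebuildEstimate.hours(500, RebuildEstimate.effectiveMBps(4, 25, 100)) estimates a 500GB rebuild over four connections, where the 25 MB/s per-connection rate and the 100 MB/s cap are made-up inputs. Under this model, raising the connection count past the point where the cap is hit changes nothing.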

I'm going to try to patch our cluster with a change to test this out, but 
wanted to hear from experts as well.

Thanks!