Re: nodetool move hammers the next node in the ring

2011-04-09 Thread Jonathan Colby
Thanks! I'll be watching this issue closely.




Re: nodetool move hammers the next node in the ring

2011-04-08 Thread aaron morton
My brain just started working. The streaming for the move may need to be 
throttled, but once a file has been received, the bloom filters, row indexes 
and secondary indexes are built. That will also take some effort. Do you have 
any secondary indexes? 

If you are doing a move again, could you try turning up logging to DEBUG on one 
of the neighbour nodes? Once the file has been received you will see a message 
saying "Finished {file_name}. Sending ack to {remote_ip}". After this log 
message the rebuilds will start. It would be interesting to see which is more 
heavyweight; I'm guessing the rebuilds.
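
To make the log check above easier, here is a minimal Python sketch that picks the "Finished ... Sending ack to ..." lines out of a DEBUG log. The pattern is built only from the message text quoted above. The layout of the rest of each log line (timestamp, level, thread name) is an assumption, and the helper name `find_stream_acks` is made up for illustration.

```python
import re

# Built from the message text quoted in this thread. Whatever surrounds it on
# the log line (timestamp, level, thread name) is assumed and simply ignored.
ACK_PATTERN = re.compile(
    r"Finished (?P<file>.+?)\. Sending ack to (?P<remote>\S+)"
)


def find_stream_acks(lines):
    """Return (line_number, file_name, remote) for every matching log line."""
    acks = []
    for number, line in enumerate(lines, start=1):
        match = ACK_PATTERN.search(line)
        if match:
            acks.append((number, match.group("file"), match.group("remote")))
    return acks
```

Passing an open log file to `find_stream_acks` gives the line numbers where each received file was acknowledged. Comparing load on the node before and after those lines should hint at whether the streaming or the rebuilds cost more.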

This is similar to https://issues.apache.org/jira/browse/CASSANDRA-2156 but 
that ticket will not cover this case. I've added this use case to the comments; 
please check there if you want to follow along.

Cheers
Aaron





Re: nodetool move hammers the next node in the ring

2011-04-08 Thread Chris Goffinet
We also have a ticket open at 

https://issues.apache.org/jira/browse/CASSANDRA-2399

We have observed in production the impact of streaming data to new nodes being 
added. In one of our clusters the entire dataset fits in page cache, and our 
99th percentile latencies go from 20 ms to 1 second on the streaming nodes when 
we bootstrap new nodes, because the streaming blows out the page cache. We are 
hoping to have this addressed soon. I think throttling of streaming would be 
good too, to help avoid saturating the network card on the streaming node. 
Dynamic snitch should help with this; we'll try to report back our results 
very soon on what it looks like for that case.
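
As a rough illustration of what throttling on the sending side could look like, here is a small Python sketch that caps the average rate of a copy by sleeping between chunks. It is a generic rate limiter, not Cassandra's streaming code. The names `Throttle` and `throttled_copy` and the chunk size are assumptions made for the example.

```python
import time


class Throttle:
    """Keep average throughput at or below bytes_per_second by sleeping."""

    def __init__(self, bytes_per_second, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(bytes_per_second)
        self.clock = clock      # injectable so the behaviour can be tested
        self.sleep = sleep
        self.start = clock()
        self.sent = 0

    def account(self, nbytes):
        """Record nbytes as sent; pause if we are ahead of the target rate."""
        self.sent += nbytes
        expected = self.sent / self.rate          # seconds this data should take
        delay = expected - (self.clock() - self.start)
        if delay > 0:
            self.sleep(delay)
        return max(delay, 0.0)


def throttled_copy(src, dst, throttle, chunk_size=64 * 1024):
    """Copy src to dst in chunks, letting the throttle pace the writes."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
        throttle.account(len(chunk))
    return total
```

A limiter like this only protects the network card on the sender. It does nothing about the page cache being evicted, which is a separate problem.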
 
-Chris




Re: nodetool move hammers the next node in the ring

2011-04-06 Thread Jonathan Colby
Thanks for the response, Aaron. Our cluster has 6 nodes with 10 GB load on 
each. RF=3. AMD 64-bit blades, quad core, 8 GB RAM, running Debian Linux. 
Swap off. Cassandra 0.7.4.


On Apr 6, 2011, at 2:40 AM, aaron morton wrote:

 Not that I know of; it may be useful to be able to throttle things. But if the 
 receiving node has little headroom it may still be overwhelmed.
 
 Currently there is a single thread for streaming. If we were to throttle, it 
 may be best to make it multi-threaded with a single concurrent stream per 
 endpoint. 
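
 One way to picture "multi-threaded with a single concurrent stream per endpoint" is the Python sketch below: a shared worker pool plus one lock per endpoint, so different endpoints stream in parallel while each endpoint receives at most one file at a time. This is only an illustration of the idea, not how Cassandra implements streaming. `PerEndpointStreamer` and the `send` callback are hypothetical names.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class PerEndpointStreamer:
    """Run sends on a thread pool, allowing one active send per endpoint."""

    def __init__(self, send, max_workers=4):
        self._send = send                          # hypothetical send(endpoint, file_name)
        self._locks = defaultdict(threading.Lock)  # one lock per endpoint
        self._locks_guard = threading.Lock()       # protects the dict itself
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def _lock_for(self, endpoint):
        with self._locks_guard:
            return self._locks[endpoint]

    def submit(self, endpoint, file_name):
        """Queue a file for an endpoint; returns a Future with send's result."""
        def task():
            # Holding the endpoint lock serialises streams to that endpoint.
            with self._lock_for(endpoint):
                return self._send(endpoint, file_name)
        return self._pool.submit(task)

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

 A task that is waiting for its endpoint lock still occupies a worker thread, so a real implementation would more likely keep a queue per endpoint. The sketch keeps the simpler form to show the constraint itself.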
 
 Out of interest, how many nodes do you have and what's the RF?
 
 Aaron
 
 
 On 6 Apr 2011, at 01:16, Jonathan Colby wrote:
 
 
 When doing a move, decommission, loadbalance, etc., data is streamed to the 
 next node in such a way that it really strains the receiving node, to the 
 point where it has a problem serving requests.
 
 Any way to throttle the streaming of data?