My brain just started working. The streaming for the move may need to be 
throttled, but once the file has been received the bloom filters, row indexes 
and secondary indexes are built. That will also take some effort. Do you have 
any secondary indexes? 
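
To make "throttled" a bit more concrete, here is a rough token bucket sketch 
in Python. It is only an illustration of rate limiting a byte stream, not how 
Cassandra streams files, and the class and parameter names are made up for 
the example.

```python
import time


class TokenBucket:
    """Hypothetical byte-rate limiter; not part of Cassandra."""

    def __init__(self, rate_bytes_per_sec, capacity_bytes,
                 clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate_bytes_per_sec)
        self.capacity = float(capacity_bytes)
        self.tokens = float(capacity_bytes)  # start with a full bucket
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def consume(self, nbytes):
        """Block until nbytes may be sent; return the seconds spent waiting."""
        if nbytes > self.capacity:
            raise ValueError("chunk is larger than the bucket capacity")
        waited = 0.0
        self._refill()
        while self.tokens < nbytes:
            # Sleep just long enough for the missing tokens to accumulate.
            delay = (nbytes - self.tokens) / self.rate
            self.sleep(delay)
            waited += delay
            self._refill()
        self.tokens -= nbytes
        return waited
```

A sender would call consume(len(chunk)) before writing each chunk. As noted 
earlier in the thread, a receiving node with little headroom could still be 
overwhelmed even at a reduced rate.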

If you are doing a move again, could you try turning up logging to DEBUG on one 
of the neighbour nodes? Once the file has been received you will see a message 
saying "Finished {file_name}. Sending ack to {remote_ip}". After this log 
message the rebuilds will start. It would be interesting to see which is more 
heavyweight; I'm guessing the rebuilds.
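
If it helps, here is a small Python sketch for pulling those ack messages out 
of a log file so they can be lined up against CPU and IO graphs. The message 
text is the one quoted above; the timestamp layout (yyyy-MM-dd HH:mm:ss,SSS) 
is an assumption about the log4j pattern in use, and the function name is 
made up.

```python
import re
from datetime import datetime

# Assumed timestamp layout, e.g. "2011-04-06 16:30:01,250".
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")
# The message text quoted above, with the two placeholders captured.
ACK = re.compile(r"Finished (?P<file>\S+)\. Sending ack to (?P<peer>\S+)")


def find_stream_acks(lines):
    """Return (timestamp, file_name, remote_ip) for each ack message found.

    timestamp is None when the line has no timestamp in the assumed layout.
    """
    acks = []
    for line in lines:
        ack = ACK.search(line)
        if ack is None:
            continue
        when = None
        ts = TIMESTAMP.search(line)
        if ts is not None:
            when = datetime.strptime(ts.group(1), "%Y-%m-%d %H:%M:%S")
            when = when.replace(microsecond=int(ts.group(2)) * 1000)
        acks.append((when, ack.group("file"), ack.group("peer")))
    return acks
```

Passing an open log file to find_stream_acks gives the time each file 
finished streaming. Load before that time would be the streaming itself, and 
load after it would be the rebuilds.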

This is similar to https://issues.apache.org/jira/browse/CASSANDRA-2156 but 
that ticket will not cover this case. I've added this use case to the comments, 
please check there if you want to follow along.

Cheers
Aaron


On 6 Apr 2011, at 16:26, Jonathan Colby wrote:

> Thanks for the response, Aaron. Our cluster has 6 nodes with 10 GB load on 
> each. RF=3. AMD 64-bit blades, quad core, 8 GB RAM, running Debian 
> Linux. Swap off. Cassandra 0.7.4
> 
> 
> On Apr 6, 2011, at 2:40 AM, aaron morton wrote:
> 
>> Not that I know of, may be useful to be able to throttle things. But if the 
>> receiving node has little head room it may still be overwhelmed.
>> 
>> Currently there is a single thread for streaming. If we were to throttle it 
>> may be best to make it multi threaded with a single concurrent stream per 
>> end point. 
>> 
>> Out of interest, how many nodes do you have and what's the RF?
>> 
>> Aaron
>> 
>> 
>> On 6 Apr 2011, at 01:16, Jonathan Colby wrote:
>> 
>>> 
>>> When doing a move, decommission, loadbalance, etc.  data is streamed to the 
>>> next node in such a way that it really strains the receiving node - to the 
>>> point where it has a problem serving requests.   
>>> 
>>> Any way to throttle the streaming of data?
>> 
> 
