[ 
https://issues.apache.org/jira/browse/CASSANDRA-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rick Branson updated CASSANDRA-5272:
------------------------------------

    Description: 
On a 12-node EC2 m1.xlarge cluster, restarting a node causes it to get 
completely overloaded with the default 2-thread, 1024KB setting in 1.2.x. This 
seemed to be a smaller problem when the cluster had 6 nodes, but it still 
required us to abort handoffs. The old defaults in 1.1.x were far more 
conservative. I've dropped the throttle to 128KB on our production cluster, 
which is very conservative but appears to have solved the problem. The default 
seems far too high for any cluster of non-trivial size.

After giving this some thought, it seems that the throttle should really be 
based on cluster size, making it a "target" for how much write load a single 
node can swallow. As the cluster grows, the share of that target that each 
other node can use to deliver hints goes down, so the throttle should 
self-adjust to take that into account.
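
A rough sketch of the idea, not a patch: treat the configured value as the 
total hint write load one node should have to absorb, and divide it among the 
other nodes that might deliver hints to it. The function name and the 
worst-case assumption that every other node delivers hints at the same time 
are hypothetical, as is reading the 1024KB default as the aggregate target.

```python
def per_sender_throttle_kb(target_kb: int, cluster_size: int) -> int:
    """Split a per-receiver hint target across the other nodes in the cluster.

    target_kb    -- total hint load one node should absorb (assumed meaning)
    cluster_size -- number of nodes in the cluster, including the receiver
    """
    if cluster_size < 2:
        # No peers can deliver hints, so there is nothing to divide.
        return target_kb
    senders = cluster_size - 1  # worst case: every other node holds hints
    # Never throttle to zero, or hints would never be delivered.
    return max(1, target_kb // senders)


if __name__ == "__main__":
    for nodes in (2, 6, 12):
        print(nodes, per_sender_throttle_kb(1024, nodes))
```

With a 1024KB target this gives each sender 204KB on a 6-node cluster and 93KB 
on a 12-node cluster, so the combined load on the restarted node stays near 
the target as the cluster grows.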

  was:
For a 16-node EC2 m1.xlarge cluster, restarting a node causes it to get 
completely overloaded with the default 2-thread, 1024KB setting in 1.2.x. This 
seemed to be a smaller problem when it was 8-nodes, but still required us to 
abort handoffs. The old defaults in 1.1.x were WAY more conservative. I've 
dropped this way down to 128KB on our production cluster which is really 
conservative, but appears to have solved it. The default seems way too high on 
any cluster that is non-trivial in size.

After putting some thought to this, it seems that this should really be based 
on cluster size, making the throttle a "target" for how much write load a 
single node can swallow. As the cluster grows, the amount of hints that can be 
delivered by each other node in the cluster goes down, so the throttle should 
self-adjust to take that into account.

    
> Hinted Handoff Throttle based on cluster size
> ---------------------------------------------
>
>                 Key: CASSANDRA-5272
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5272
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.2.1
>            Reporter: Rick Branson
