Alain RODRIGUEZ created CASSANDRA-9509:
------------------------------------------
Summary: Streams throughput control
Key: CASSANDRA-9509
URL: https://issues.apache.org/jira/browse/CASSANDRA-9509
Project: Cassandra
Issue Type: Improvement
Components: Config
Reporter: Alain RODRIGUEZ
Priority: Minor
Currently, I have to keep tuning stream throughput manually (through
nodetool setstreamthroughput), since the same value applies to, for example,
both a decommission and a removenode. In the first case, data streams from
1 --> N nodes (and is obviously limited by the sending node); in the second,
it streams from N --> N nodes (N being the number of remaining nodes).
When removing a node, the throughput limit will not be reached in most cases,
and all the nodes will be under heavy load. So with the same stream throughput
value, we send up to N times faster on a removenode than on a decommission.
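To make the comparison above concrete, here is a rough sketch in Python. It
assumes the value set through nodetool setstreamthroughput acts as a per-node
outbound cap and that every sending node manages to saturate it, so the result
is an upper bound. The function name, the 200 Mbit/s figure and the 19
remaining nodes are illustrative assumptions, not measurements.

```python
def aggregate_stream_rate(per_node_limit_mbps, sending_nodes):
    """Upper bound on the cluster-wide streaming rate when each sender is capped individually."""
    return per_node_limit_mbps * sending_nodes


limit = 200           # Mbit/s, per-node outbound cap (illustrative value)
remaining_nodes = 19  # hypothetical cluster: 20 nodes, one being removed

# decommission: only the leaving node sends (1 --> N)
decommission_rate = aggregate_stream_rate(limit, 1)

# removenode: every remaining node may send (N --> N)
removenode_rate = aggregate_stream_rate(limit, remaining_nodes)

ratio = removenode_rate // decommission_rate
print(decommission_rate, removenode_rate, ratio)
```

With a single shared setting, the only way to keep the aggregate rate equal in
both cases is to change the per-node value by hand before each operation.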
Another example is repair, which also streams faster as more nodes start
repairing. We have 20 nodes, each taking 2+ days to repair its data, and
repair has to complete within 10 days, so it can't run one node at a time,
and stream throughput needs to be adjusted accordingly.
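The repair figures above imply a minimum number of nodes repairing at the same
time. The sketch below works this out in Python; the function name is made up
for illustration, and since a repair takes "2+" days the result is a lower
bound.

```python
import math


def min_concurrent_repairs(nodes, days_per_repair, window_days):
    """Smallest number of simultaneous repairs that fits every node into the window."""
    return math.ceil(nodes * days_per_repair / window_days)


# 20 nodes, at least 2 days each, all within 10 days
needed = min_concurrent_repairs(20, 2, 10)
print(needed)
```

Each extra concurrent repair adds streaming load on the cluster, which is why
the per-node throughput value has to be lowered as more repairs overlap.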
Is there a way to:
- limit incoming network traffic on a node?
- limit the network traffic sent cluster-wide?
- make streaming processes background tasks (using only the remaining
resources)? This looks harder to me, since the bottleneck depends on the node
hardware and the workload. It can be the CPU, the network, the disk
throughput or even the memory...
If none of those ideas are doable, could we separate the stream throughput
settings per operation, so that each can be configured individually in
cassandra.yaml?