Assuming data is evenly distributed across your cluster, and reads are
evenly distributed across those keys, you do not need to increase RF as
the cluster grows in order to increase read performance.  If you have 3
nodes with RF=3 and do 3 million reads, with good distribution each node
serves 1 million of them.  If you grow to 6 nodes and keep RF=3, each
node owns half as much data and serves only 500,000 reads.  Or, more
meaningfully: in the time the 3-node cluster takes to do 3 million
reads, the 6-node cluster ought to be able to do 6 million, since each
node is still responsible for only 1 million reads.
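
A rough sketch of that arithmetic, in case it helps.  It assumes perfectly
even distribution and that each read is served by a single replica (for
example, reads at consistency level ONE); the function names are made up
for illustration and are not part of any Cassandra API.

```python
def reads_per_node(total_reads: int, nodes: int) -> float:
    """Reads served by each node under a perfectly even spread.

    Assumption: every read is answered by exactly one replica, so RF
    does not enter into the per-node read count.
    """
    return total_reads / nodes


def data_fraction_per_node(rf: int, nodes: int) -> float:
    """Fraction of the full data set that each node stores."""
    return min(rf, nodes) / nodes


# 3 nodes, RF=3: every node holds all the data and serves 1 million reads.
print(data_fraction_per_node(3, 3), reads_per_node(3_000_000, 3))

# 6 nodes, RF=3: every node holds half the data and serves 500,000 reads.
print(data_fraction_per_node(3, 6), reads_per_node(3_000_000, 6))

# Same per-node load as the 3-node case, but twice the total reads.
print(reads_per_node(6_000_000, 6))
```

With a higher read consistency level each read touches more replicas, so
the per-node numbers go up, but they still drop as nodes are added.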

On Mon, Mar 20, 2017 at 11:24 PM Alain Rastoul <alf.mmm....@gmail.com>
wrote:

> On 20/03/2017 22:05, Michael Wojcikiewicz wrote:
> > Not sure if someone has suggested this, but I believe it's not
> > sufficient to simply add nodes to a cluster to increase read
> > performance: you also need to alter the ReplicationFactor of the
> > keyspace to a larger value as your cluster gets larger.
> >
> > ie. data is available from more nodes in the cluster for each query.
> >
> Yes, good point: in case of cluster growth, there would be more replicas
> to handle the same key ranges.
> And also readjust token ranges:
> https://cassandra.apache.org/doc/latest/operating/topo_changes.html
>
> SG, can you give some information (or share your code) about how you
> generate your data and how you read it ?
>
> --
> best,
> Alain
>
>
