Is the driver doing the right thing by directing all reads for a given token to the same node? If that node fails, then all of those reads will be redirected to other nodes, all of whom will be cache-cold for the failed node's primary token range. It seems like the driver should distribute reads among all the replicas for a token, at least as an option, to keep the caches warm for latency-sensitive loads.
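
For what it's worth, recent DataStax drivers do expose something along these lines. A minimal sketch, assuming the 3.x Java driver (the contact point is made up); the shuffleReplicas flag asks the token-aware policy to rotate among all replicas of a token instead of always preferring the first one:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
    import com.datastax.driver.core.policies.TokenAwarePolicy;

    public class ShuffledReplicaRouting {
        public static void main(String[] args) {
            // shuffleReplicas = true spreads requests across all replicas of a
            // token rather than always hitting the same node first.
            Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")  // hypothetical contact point
                    .withLoadBalancingPolicy(new TokenAwarePolicy(
                            DCAwareRoundRobinPolicy.builder().build(),
                            true /* shuffleReplicas */))
                    .build();
            Session session = cluster.connect();
            // ... run latency-sensitive reads here ...
            session.close();
            cluster.close();
        }
    }

Whether that wins depends on whether you would rather have one very warm cache per token range or several lukewarm ones.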

On 03/26/2017 07:46 PM, Eric Stevens wrote:
Yes, throughput for a given partition key cannot be improved with horizontal scaling. You can increase RF to theoretically improve throughput on that key, but actually in this case smart clients might hold you back, because they're probably token aware, and will try to serve that read off the key's primary replica, so all reads would be directed at a single node for that key.

If you're reading at CL=QUORUM, there's a chance that increasing RF will actually reduce performance rather than improve it, because you've increased the total amount of work to serve the read (as well as the write). If you're reading at CL=ONE, increasing RF will increase the chances of falling afoul of eventual consistency.
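
To put rough numbers on the QUORUM case: a quorum is floor(RF / 2) + 1 replicas, so moving from RF=3 to RF=4 raises the replicas touched by every quorum read from 2 to 3. A tiny sketch of that arithmetic:

    public class QuorumSize {
        // Quorum for a given replication factor: floor(RF / 2) + 1.
        static int quorum(int rf) {
            return rf / 2 + 1;
        }

        public static void main(String[] args) {
            for (int rf = 3; rf <= 5; rf++) {
                System.out.printf("RF=%d -> each QUORUM read touches %d replicas%n",
                        rf, quorum(rf));
            }
            // RF=3 -> 2, RF=4 -> 3, RF=5 -> 3
        }
    }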

However, that's not really a real-world scenario. Or if it is, Cassandra is probably the wrong tool for that kind of workload.

On Thu, Mar 23, 2017 at 11:43 PM Alain Rastoul <alf.mmm....@gmail.com> wrote:

    On 24/03/2017 01:00, Eric Stevens wrote:
    > Assuming an even distribution of data in your cluster, and an even
    > distribution across those keys by your readers, you would not need to
    > increase RF with cluster size to increase read performance.  If you have
    > 3 nodes with RF=3, and do 3 million reads, with good distribution, each
    > node has served 1 million read requests.  If you increase to 6 nodes and
    > keep RF=3, then each node now owns half as much data and serves only
    > 500,000 reads.  Or more meaningfully in the same time it takes to do 3
    > million reads under the 3 node cluster you ought to be able to do 6
    > million reads under the 6 node cluster since each node is just
    > responsible for 1 million total reads.
    >
    Hi Eric,

    I think I got your point.
    In the case of really evenly distributed reads it may (or should?) not
    make any difference.

    But when the reads are not well distributed (and in that case only),
    my understanding was that a higher RF could help spread the load:
    in that case, with RF=4 instead of 3, and several clients accessing the
    same key ranges, a coordinator could pick one node out of 4 replicas to
    handle the request instead of one out of 3, thus having more "workers"
    to handle a request?

    Am I wrong here?

    Thank you for the clarification


    --
    best,
    Alain

