stress.py's multiget option sends increasingly inefficient queries as more test 
data is inserted
------------------------------------------------------------------------------------------------

                 Key: CASSANDRA-1520
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1520
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 0.7 beta 1
            Reporter: Nate McCall
             Fix For: 0.7 beta 2


MultiGetter's key list sizes should be broken up better for more efficient 
queries. Setting an initial value that breaks up the key list into N sub lists 
(where N is the number of threads) yielded more efficient queries. (The choice 
of thread count here was a stop-gap for demonstration purposes. End result 
should probably be chunk-size config option with a sane default).

Pre patch:
---
python stress.py -o multiget -t 25 -n 250000 -c 5
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
6,0,6000,8.6109764576,10
10,0,4000,18.6666852832,20
17,0,7000,27.4705835751,30
23,0,6000,36.6091703971,41
25,0,2000,41.8415510654,42

Post patch:
---
python mstress.py -o multiget -t 25 -n 250000 -c 5
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
172,17,6880,1.44215503127,10
314,14,5680,1.8667214538,20
466,15,6080,1.69888155084,31
624,15,6320,1.55442555947,41
625,0,40,0.0914790630341,41

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to