Benedict created CASSANDRA-7519:
-----------------------------------

             Summary: Further stress improvements to generate more realistic workloads
                 Key: CASSANDRA-7519
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7519
             Project: Cassandra
          Issue Type: Bug
          Components: Tools
            Reporter: Benedict
            Assignee: Benedict
            Priority: Minor
             Fix For: 2.1.0


We generally believe that the most common workload is for reads to 
exponentially prefer the most recently written data. However, stress currently 
has only two id generation modes: sequential and random (although random can 
follow a distribution). I propose introducing a new mode which is somewhat like 
sequential, except that we 'look back' from the current id by an amount drawn 
from a distribution. I may also make the position increment only when an id is 
first written, so that this mode can be run from a clean slate with a mixed 
workload. This should allow us to generate workloads that are more 
representative.
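
As a rough illustration of the look-back mode described above, here is a 
minimal Python sketch. It is not stress code: the class name 
LookbackIdGenerator, its method names, the mean_lookback parameter, and the 
choice of an exponential distribution for the look-back are all assumptions 
made for the example.

```python
import random


class LookbackIdGenerator:
    """Hypothetical sketch: write ids ascend; read ids look back from the newest."""

    def __init__(self, mean_lookback=10.0, seed=None):
        self.mean_lookback = mean_lookback  # assumed mean look-back distance
        self.position = 0                   # number of ids written so far
        self._rng = random.Random(seed)

    def next_write_id(self):
        # The position advances only when an id is first written, so a
        # mixed workload can start from an empty data set.
        written = self.position
        self.position += 1
        return written

    def next_read_id(self):
        if self.position == 0:
            raise ValueError("nothing has been written yet")
        newest = self.position - 1
        # Exponentially distributed look-back: small distances are the most
        # likely, so reads favour recently written ids.
        lookback = int(self._rng.expovariate(1.0 / self.mean_lookback))
        # Clamp so that we never read an id below the first one written.
        return max(0, newest - lookback)
```

With mean_lookback=5.0 and 100 ids written, most reads should land within 
the last ten ids, while older ids are still touched occasionally.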

At the same time, I will introduce a timestamp value generator for primary key 
columns that is strictly ascending, i.e. it has some random component but is 
based on the actual system time (or some shared monotonically increasing 
state), so that we can again generate a more realistic workload. This may be 
challenging to tie in with the new procedurally generated partitions, but I'm 
sure it can be done without too much difficulty.
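
One possible shape for such a generator, sketched in Python under stated 
assumptions: the class name AscendingTimestampGenerator, the microsecond 
resolution, the max_jitter_micros parameter, and the injectable clock are all 
hypothetical and are not taken from stress. The sketch combines the system 
time with a random offset and uses shared state under a lock to keep the 
output strictly ascending.

```python
import random
import threading
import time


class AscendingTimestampGenerator:
    """Hypothetical sketch: strictly ascending timestamps with random jitter."""

    def __init__(self, max_jitter_micros=1000, clock=None, seed=None):
        # Assumed default clock: system time in microseconds.
        self._clock = clock or (lambda: time.time_ns() // 1000)
        self._max_jitter = max_jitter_micros
        self._rng = random.Random(seed)
        self._last = 0                    # shared monotonically increasing state
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            # System time plus a random component.
            candidate = self._clock() + self._rng.randint(0, self._max_jitter)
            # Never return a value at or below the previous one, even if the
            # clock stalls or the jitter happens to be smaller this time.
            self._last = max(candidate, self._last + 1)
            return self._last
```

The lock makes the shared state safe across threads at the cost of 
serialising callers; a lock-free counter would be a natural alternative if 
contention matters.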



--
This message was sent by Atlassian JIRA
(v6.2#6252)
