[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress

Alan Boudreault (JIRA) Mon, 26 Sep 2016 06:23:21 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15523039#comment-15523039
 ]


Alan Boudreault commented on CASSANDRA-12490:
---------------------------------------------

[~slater_ben] I've been testing this feature over the weekend. At first 
sight, it works well and is very useful. However, I'm getting unexpected 
results when I add a third clustering column. I've attached my yaml 
configuration as a test case. There are some comments in the file, but here is 
a brief description of the issue:

cassandra-stress user profile=12490.yaml ops\(insert=1\) n=10 -rate threads=1

{code}
-->  PRIMARY KEY ((stid, year, month), day, hour, minute)
{code}

{code}
  - name: day
    cluster: fixed(30)
    population: seq(1..30)
  - name: hour
    cluster: fixed(24)
    population: seq(1..24)
  - name: minute
    cluster: fixed(60)
    population: seq(1..60)
{code}

With 3 clustering columns, it looks like only the last one is considered. So, 
with n=10, I got 600 rows total (10*60), when I should have 43200 rows 
(60*24*30) per partition. If I remove minute from the clustering columns, 
things work as expected: 7200 rows total (10*24*30).
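
For reference, the counts above can be reproduced with a short sketch. It is 
only an illustration of the arithmetic, not of how cassandra-stress works 
internally: the function name expected_rows is hypothetical, and it assumes 
that each fixed(N) cluster paired with seq(1..N) contributes N distinct values 
and that n=10 yields 10 partitions.

```python
from math import prod


def expected_rows(partitions, cluster_sizes):
    """Rows expected if every clustering level contributes all of its values.

    Hypothetical helper for illustration only; it is not part of
    cassandra-stress.
    """
    return partitions * prod(cluster_sizes)


# Cluster sizes taken from the yaml fragment: day=30, hour=24, minute=60.
print(expected_rows(10, [30, 24]))      # day and hour only
print(expected_rows(10, [30, 24, 60]))  # day, hour and minute
print(expected_rows(10, [60]))          # only the last column counted
```

The first call gives the 7200 rows seen with two clustering columns, and the 
last one matches the 600 rows observed with three, which is consistent with 
only the minute column being taken into account.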



> Add sequence distribution type to cassandra stress
> --------------------------------------------------
>
>                 Key: CASSANDRA-12490
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12490
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Ben Slater
>            Assignee: Ben Slater
>            Priority: Minor
>             Fix For: 3.10
>
>         Attachments: 12490-trunk.patch, 12490.yaml, cqlstress-seq-example.yaml
>
>
> When using the write command, cassandra stress sequentially generates seeds. 
> This ensures generated values don't overlap (unless the sequence wraps), 
> providing a more predictable number of inserted records (and generating a base 
> set of data without wasted writes).
> When using a yaml stress spec there is no sequenced distribution available. 
> I think it would be useful to have this for doing an initial load of data for 
> testing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
