[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress

Stefania (JIRA) Sun, 16 Oct 2016 21:28:07 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15581172#comment-15581172 ]


Stefania commented on CASSANDRA-12490:
--------------------------------------

Thanks for the patch, Ben.

Why not simply call {{next.set(seed)}}, or use a {{compareAndSet()}} in a loop 
so that {{next}} can remain final? 
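
For illustration, here is a minimal sketch of the two options. The class and 
method names are hypothetical and are not taken from the patch; the only 
assumption is that {{next}} is a {{java.util.concurrent.atomic.AtomicLong}}.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the sequence class; names are illustrative only.
final class SeqCounter
{
    // The reference stays final; only the value it holds changes.
    private final AtomicLong next = new AtomicLong();

    // Option 1: unconditionally overwrite the counter.
    void setSeed(long seed)
    {
        next.set(seed);
    }

    // Option 2: the same effect through a compareAndSet() retry loop.
    void setSeedWithCas(long seed)
    {
        long current;
        do
        {
            current = next.get();
        }
        while (!next.compareAndSet(current, seed));
    }

    long next()
    {
        return next.getAndIncrement();
    }
}
```

Either way there is no need to replace the {{AtomicLong}} instance, so the 
field can keep its final modifier.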

Also, the updated patch changes the meaning of {{SEQ(min..max)}}: it now 
generates values in {{(min + seed .. min + seed + max)}} rather than 
{{(min .. max)}}. I assume this is what you wanted, but the SEQ declaration 
now looks a bit misleading. I would remove {{start}} from the declaration and 
set it to the seed, leaving {{next}} to always start at zero. This has the 
added advantage that you don't need to fix {{inverseCumProb()}}. The help 
section also needs rewriting. Lastly, please add a unit test case that calls 
{{setSeed()}} so that we can catch any other problems I may have missed.
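
Roughly, the shape I have in mind is something like the sketch below. This is 
only an illustration under assumptions: the class name, the {{length}} 
parameter and the wrap-around behaviour are hypothetical and are not taken 
from the patch.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; not the class from the attached patch.
final class SeedStartSequence
{
    private final long length;                         // assumed: values emitted before wrapping
    private final AtomicLong next = new AtomicLong();  // offset, always starts at zero
    private volatile long start;                       // taken from the seed, not the declaration

    SeedStartSequence(long length)
    {
        if (length <= 0)
            throw new IllegalArgumentException("length must be positive");
        this.length = length;
    }

    void setSeed(long seed)
    {
        start = seed;
    }

    long next()
    {
        // floorMod keeps the offset in [0, length) even if the counter overflows
        return start + Math.floorMod(next.getAndIncrement(), length);
    }
}
```

With a length of 3 and a seed of 10 this yields 10, 11, 12 and then wraps 
back to 10. A unit test that calls {{setSeed()}} could assert exactly that.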

I'm happy to review the code, and to do any commits or rollbacks as required, 
but I don't feel I am the most qualified person to discuss the merits of the 
SEQ approach versus what [~benedict] suggested, since I don't have a 
particularly good knowledge of cassandra-stress data generation. If anyone 
with better knowledge wants to comment further, that is welcome.

> Add sequence distribution type to cassandra stress
> --------------------------------------------------
>
>                 Key: CASSANDRA-12490
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12490
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Ben Slater
>            Assignee: Ben Slater
>            Priority: Minor
>             Fix For: 3.10
>
>         Attachments: 12490-trunk.patch, 12490.yaml, 12490update-trunk.patch, 
> cqlstress-seq-example.yaml
>
>
> When using the write command, cassandra stress sequentially generates seeds. 
> This ensures generated values don't overlap (unless the sequence wraps), 
> providing a more predictable number of inserted records (and generating a 
> base set of data without wasted writes).
> When using a yaml stress spec there is no sequenced distribution available. 
> I think it would be useful to have this for doing an initial load of data 
> for testing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
