Stefania commented on CASSANDRA-12490:

Thanks for the patch, Ben.

Why not simply call {{next.set(seed)}}, or use a {{compareAndSet()}} in a loop 
so that {{next}} can remain final? 
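
For reference, a minimal sketch of the two options, assuming {{next}} is an {{AtomicLong}}. The class {{SeqCounter}} and the method names {{setSeedCas()}} and {{nextValue()}} are hypothetical and are not taken from the patch:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the SEQ distribution; not the code from the patch.
class SeqCounter {
    // The reference stays final; only the value it holds changes.
    private final AtomicLong next = new AtomicLong();

    // Option 1: overwrite the counter unconditionally.
    void setSeed(long seed) {
        next.set(seed);
    }

    // Option 2: retry compareAndSet() until this write wins.
    void setSeedCas(long seed) {
        long current;
        do {
            current = next.get();
        } while (!next.compareAndSet(current, seed));
    }

    // Hands out the current value and advances the counter.
    long nextValue() {
        return next.getAndIncrement();
    }
}
```

Either way the field is never reassigned, so it can keep its {{final}} modifier. For an unconditional overwrite, {{set()}} is the simpler of the two.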

Also, the updated patch changes the meaning of {{SEQ(min..max)}} to generate 
values between {{(min + seed .. min + seed + max)}} rather than {{(min .. 
max)}}. I assume this is what you wanted, but the SEQ declaration looks a bit 
misleading now. I would remove {{start}} from the declaration and set it to 
the seed, leaving {{next}} to always start at zero; this has the added 
advantage that you don't need to fix {{inverseCumProb()}}. The help section 
also needs rewriting. Lastly, please add a unit test case that calls 
{{setSeed()}} so we can catch any other problems I may have missed.
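
One possible reading of the suggestion above, as a rough sketch only. The class {{SeqSketch}}, its fields, and the wrap-around over a window of {{max - min + 1}} values are all assumptions for illustration; the exact bounds and wrapping behaviour in the patch may differ:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; field names and wrap-around behaviour are assumed.
class SeqSketch {
    private final long min;
    private final long totalCount;                    // window size, assumed max - min + 1
    private volatile long start;                      // set from the seed, not from the declaration
    private final AtomicLong next = new AtomicLong(); // always starts at zero

    SeqSketch(long min, long max) {
        if (min > max)
            throw new IllegalArgumentException("min must not exceed max");
        this.min = min;
        this.totalCount = 1 + max - min;
    }

    // The seed only shifts the window; the counter itself is untouched.
    void setSeed(long seed) {
        start = seed;
    }

    // Returns min + start + offset, where offset cycles through 0..totalCount-1.
    long next() {
        long offset = next.getAndIncrement() % totalCount;
        return min + start + offset;
    }
}
```

A unit test along these lines could build {{new SeqSketch(1, 3)}}, call {{setSeed(10)}}, and check that successive {{next()}} calls return 11, 12, 13 and then wrap back to 11.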

I'm happy to review the code, and to do any commits or rollbacks as required, 
but I don't feel like the most qualified person to discuss the merits of the 
SEQ approach, vs. what was suggested by [~benedict], since I don't have a 
particularly good knowledge of cassandra-stress data generation. So if anyone 
with better knowledge wants to comment further, this is welcome.

> Add sequence distribution type to cassandra stress
> --------------------------------------------------
>                 Key: CASSANDRA-12490
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12490
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Ben Slater
>            Assignee: Ben Slater
>            Priority: Minor
>             Fix For: 3.10
>         Attachments: 12490-trunk.patch, 12490.yaml, 12490update-trunk.patch, 
> cqlstress-seq-example.yaml
> When using the write command, cassandra stress sequentially generates seeds. 
> This ensures generated values don't overlap (unless the sequence wraps), 
> providing a more predictable number of inserted records (and generating a 
> base set of data without wasted writes).
> When using a yaml stress spec there is no sequenced distribution available. 
> I think it would be useful to have this for doing an initial load of data 
> for testing.

This message was sent by Atlassian JIRA
