[ 
https://issues.apache.org/jira/browse/CASSANDRA-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-7980:
----------------------------------------
    Assignee:     (was: Branimir Lambov)

> cassandra-stress should support partial clustering column generation
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-7980
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7980
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Testing
>            Reporter: Benedict
>            Priority: Minor
>
> cassandra-stress generates its data randomly, in tiers, so that we can scroll
> through the partitions it generates without having to generate them in their
> entirety. The problem is that to support very large partitions (important for
> benchmarking certain cases, and for acceptance testing) we need a large
> number of clustering columns, generally more than we would otherwise have,
> which changes the performance characteristics. We should effectively split
> each clustering column into a number of byte-ranges that become tiers for
> visitation. The only real complexity here is in obeying the specified
> size/count distribution range, which would be difficult for exponential
> distributions. However, we could require the user to specify the ranges, and
> the distributions for each range, upfront. We could even treat them exactly
> like other column specifications, but as sub-specs within a given column in
> the yaml. Or we could simply accept that we follow the distribution
> imperfectly in these situations.
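
As a rough illustration of the idea in the description, the sketch below treats one fixed-width clustering value as a sequence of byte-ranges, each acting as a tier with its own number of distinct values. It is not cassandra-stress code: the function names (tier_values, clustering_values), the (width, count) tier format, and the use of SHA-256 as the deterministic generator are all assumptions made for the example. Each tier's values depend only on the seed and the bytes chosen so far, so a given tier can be visited without generating the whole partition.

```python
import hashlib
from typing import Iterator, List, Sequence, Tuple


def tier_values(seed: bytes, prefix: bytes, width: int, count: int) -> List[bytes]:
    """Derive `count` distinct `width`-byte values for one tier (hypothetical helper).

    The result depends only on the seed and the prefix chosen in earlier tiers.
    """
    if count > 256 ** width:
        raise ValueError("tier cannot hold that many distinct values")
    seen = set()
    counter = 0
    while len(seen) < count:
        # Hash seed, prefix length, prefix, and a counter to get repeatable bytes.
        material = seed + bytes([len(prefix)]) + prefix + counter.to_bytes(8, "big")
        seen.add(hashlib.sha256(material).digest()[:width])
        counter += 1
    return sorted(seen)


def clustering_values(
    seed: bytes, tiers: Sequence[Tuple[int, int]], prefix: bytes = b""
) -> Iterator[bytes]:
    """Lazily walk the tiers depth-first, yielding complete clustering values."""
    if not tiers:
        yield prefix
        return
    width, count = tiers[0]
    for value in tier_values(seed, prefix, width, count):
        yield from clustering_values(seed, tiers[1:], prefix + value)


# Assumed layout: a 4-byte clustering column split into byte-ranges of 1, 1 and 2 bytes.
TIERS = [(1, 3), (1, 2), (2, 4)]
rows = list(clustering_values(b"partition-0", TIERS))
```

Here the per-tier counts are fixed constants; following a user-specified size/count distribution per range, as the description proposes, would mean drawing each count from that range's distribution instead.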



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
