[jira] [Updated] (CASSANDRA-7980) cassandra-stress should support partial clustering column generation

2016-02-03 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-7980:

Assignee: (was: Branimir Lambov)

> cassandra-stress should support partial clustering column generation
> 
>
> Key: CASSANDRA-7980
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7980
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
> Reporter: Benedict
> Priority: Minor
>
> cassandra-stress generates its data randomly, in tiers, so that we can scroll 
> through the partitions it generates without having to generate them in their 
> entirety. The problem is that to support very large partitions (important for 
> benchmarking certain cases, and for acceptance testing) we need a large 
> number of clustering columns, generally more than we would otherwise have, 
> which changes the performance characteristics. We should effectively split 
> each clustering column into a number of byte-ranges that become tiers for 
> visitation. The only real complexity here is in obeying the specified 
> size/count distribution range, which would be difficult for exponential 
> distributions. However, we could require the user to specify the ranges, and 
> the distributions for each range, upfront. We could even treat them exactly 
> like other column specifications, but as sub-specs within a given column in 
> the yaml. Or, we could simply accept that we follow the distribution 
> imperfectly in these situations.
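
To make the byte-range idea concrete, here is a minimal sketch in Python. It is not cassandra-stress code: the function names, the SHA-256 based seeding, and the fixed per-tier fan-outs are all assumptions made for illustration (the proposal above talks about configurable distributions per range, which this sketch does not model). It only shows how one clustering value can be assembled from several byte-range tiers, so that a given row can be visited without generating the rows before it.

```python
import hashlib
import struct


def child_seed(seed: int, tier: int, index: int) -> int:
    """Derive a deterministic 64-bit seed for entry `index` of `tier` (hypothetical scheme)."""
    digest = hashlib.sha256(struct.pack(">QII", seed, tier, index)).digest()
    return struct.unpack(">Q", digest[:8])[0]


def segment_bytes(seed: int, length: int) -> bytes:
    """Produce the bytes of one byte-range from its seed (length must be <= 32)."""
    return hashlib.sha256(struct.pack(">Q", seed)).digest()[:length]


def row_path(row_number: int, fanouts: tuple) -> tuple:
    """Map a row number to one index per tier (mixed-radix decomposition).

    `fanouts` is an assumed fixed number of entries per tier.
    """
    path = []
    for fanout in reversed(fanouts):
        row_number, index = divmod(row_number, fanout)
        path.append(index)
    if row_number != 0:
        raise ValueError("row number exceeds the product of the fan-outs")
    return tuple(reversed(path))


def clustering_value(partition_seed: int, path: tuple, segment_len: int = 4) -> bytes:
    """Build one clustering value by concatenating one byte-range per tier."""
    seed = partition_seed
    value = b""
    for tier, index in enumerate(path):
        # Each tier's seed depends only on the seeds above it, so rows that
        # share a path prefix also share the leading byte-ranges.
        seed = child_seed(seed, tier, index)
        value += segment_bytes(seed, segment_len)
    return value
```

With fan-outs of (2, 3), `row_path(5, (2, 3))` returns `(1, 2)`, and `clustering_value(42, (1, 2))` shares its first four bytes with `clustering_value(42, (1, 0))`, because both rows fall under the same first-tier entry. A single clustering column therefore behaves like two tiers without adding a second column to the schema.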



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7980) cassandra-stress should support partial clustering column generation

2015-12-02 Thread Branimir Lambov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-7980:

Component/s: Testing

> cassandra-stress should support partial clustering column generation
> 
>
> Key: CASSANDRA-7980
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7980
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
> Reporter: Benedict
> Assignee: Branimir Lambov
> Priority: Minor
>





[jira] [Updated] (CASSANDRA-7980) cassandra-stress should support partial clustering column generation

2015-01-05 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-7980:

Issue Type: Improvement  (was: Bug)

> cassandra-stress should support partial clustering column generation
>
>
> Key: CASSANDRA-7980
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7980
> Project: Cassandra
>  Issue Type: Improvement
> Reporter: Benedict
> Assignee: Branimir Lambov
> Priority: Minor



