[ https://issues.apache.org/jira/browse/CASSANDRA-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273894#comment-14273894 ]
Benedict commented on CASSANDRA-8597:
-------------------------------------
There is a FIXED distribution - if you want exactly 1M, why not use this? With
a depth of 3, as stated, FIXED(100) for each clustering column would do the
trick (100 x 100 x 100 = 1M).
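The arithmetic behind that suggestion can be checked with a short sketch. This
is plain Python, not stress itself: the function name rows_per_partition is
made up for illustration, and it assumes every clustering value at one level
fans out to exactly the fixed count at the next level.

```python
import math

def rows_per_partition(fan_outs):
    """Rows in one partition when each clustering level has a fixed fan-out.

    fan_outs: one fixed count per clustering column, outermost first.
    (Hypothetical helper for illustration; not part of cassandra-stress.)
    """
    return math.prod(fan_outs)

# Three clustering columns, FIXED(100) on each.
print(rows_per_partition([100, 100, 100]))  # 1000000
```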
If we reenvisage the way we define the distribution, as I alluded to in #2, you
could define the total number of rows you want in the partition. But then
conceptualising how those rows are distributed amongst the clustering columns
becomes hard, and a different PITA. You'd need two knobs per clustering column:
the share of the fan-out it should adopt, and the variance between each value.
Understanding how these interact with each other (both intra-tier and
inter-tier) would be really quite difficult for people to think about, which is
why I originally chose to let it be configured per clustering column. It does,
however, also solve your problem #2. It's a more powerful way of specifying the
distribution, but I'm concerned that stress is already considered difficult to
understand.
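To make the "share of fan-out" knob concrete, here is one possible way a total
row count could be split across clustering columns. This is purely a
hypothetical sketch in Python: the name fan_outs_from_shares, the shares
parameter, and the rule that a column with share s gets a fan-out of
total_rows ** s are all assumptions for illustration, not anything stress
implements. The variance knob is left out entirely.

```python
import math

def fan_outs_from_shares(total_rows, shares):
    """Split a target row count into per-column fan-outs.

    Assumed rule (illustrative only): a column with share s gets a
    fan-out of total_rows ** s. Shares must sum to 1, so the product
    of the unrounded fan-outs equals total_rows.
    """
    if not math.isclose(sum(shares), 1.0):
        raise ValueError("shares must sum to 1")
    return [total_rows ** s for s in shares]

# Equal shares across three clustering columns.
even = fan_outs_from_shares(1_000_000, [1 / 3, 1 / 3, 1 / 3])
print([round(f) for f in even])

# Uneven shares: the outer column takes half of the fan-out.
skewed = fan_outs_from_shares(1_000_000, [0.5, 0.25, 0.25])
print([round(f) for f in skewed], math.prod(round(f) for f in skewed))
```

Even in this simplified form, fan-outs have to be rounded to whole numbers, so
with uneven shares the product only approximates the requested total. That is
before any per-value variance is layered on top.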
> Stress: make simple things simple
> ---------------------------------
>
> Key: CASSANDRA-8597
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8597
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Jonathan Ellis
> Assignee: T Jake Luciani
> Fix For: 2.1.3
>
>
> Some of the trouble people have with stress is a documentation problem, but
> some is functional.
> Comments from [~iamaleksey]:
> # 3 clustering columns, make a million cells in a single partition, should be
> simple, but it's not. have to tweak 'clustering' on the three columns just
> right to make stress work at all. w/ some values it just gets stuck forever
> computing batches
> # for others, it generates huge, megabyte-size batches, utterly disrespecting
> 'select' clause in 'insert'
> # I want a sequential generator too, to be able to predict deterministic
> result sets. uniform() only gets you so far
> # impossible to simulate a time series workload
> /cc [~jshook] [~aweisberg] [~benedict]