[
https://issues.apache.org/jira/browse/CASSANDRA-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sylvain Lebresne updated CASSANDRA-11138:
-----------------------------------------
Labels: stress (was: )
> cassandra-stress tool - clustering key values not distributed
> -------------------------------------------------------------
>
> Key: CASSANDRA-11138
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11138
> Project: Cassandra
> Issue Type: Bug
> Components: Tools
> Environment: Cassandra 2.2.4, Centos 6.5, Java 8
> Reporter: Ralf Steppacher
> Labels: stress
>
> I am trying to get the stress tool to generate random values for three
> clustering keys, in order to simulate collecting events per user id (text,
> partition key). Each event has a session type (text), an event type (text),
> and a creation time (timestamp), which are the clustering keys in that order.
> For testing purposes I ended up with the following column spec:
> {noformat}
> columnspec:
>   - name: created_at
>     cluster: uniform(10..10)
>   - name: event_type
>     size: uniform(5..10)
>     population: uniform(1..30)
>     cluster: uniform(1..30)
>   - name: session_type
>     size: fixed(5)
>     population: uniform(1..4)
>     cluster: uniform(1..4)
>   - name: user_id
>     size: fixed(15)
>     population: uniform(1..1000000)
>   - name: message
>     size: uniform(10..100)
>     population: uniform(1..100B)
> {noformat}
> My expectation was that this would create anywhere between 10 and 1200 rows
> per partition key (10 * 1..30 * 1..4). Instead, exactly 10 rows are created
> per partition key, and the {{created_at}} timestamp is the only clustering
> column whose value varies within a partition. The {{session_type}} and
> {{event_type}} columns are assigned a single fixed value per partition key.
> This is the case even if I set the cluster distributions to uniform(30..30)
> for {{event_type}} and uniform(4..4) for {{session_type}}. With this setting
> I expected 1200 rows per partition key, which is also what the stress tool
> announces when it runs, but it is still 10.
> {noformat}
> [rsteppac@centos bin]$ ./cassandra-stress user profile=../batch_too_large.yaml ops\(insert=1\) -log level=verbose file=~/centos_eventy_patient_session_event_timestamp_insert_only.log -node 10.211.55.8
> …
> Created schema. Sleeping 1s for propagation.
> Generating batches with [1..1] partitions and [1..1] rows (of [1200..1200] total rows in the partitions)
> Improvement over 4 threadCount: 19%
> ...
> {noformat}
> Sample of generated data:
> {noformat}
> cqlsh> select user_id, event_type, session_type, created_at from stresscql.batch_too_large LIMIT 30 ;
> user_id | event_type | session_type | created_at
> -----------------------------+------------------+--------------+--------------------------
> %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| | P+|u\x0b | 2012-10-19 08:14:11+0000
> %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| | P+|u\x0b | 2004-11-08 04:04:56+0000
> %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| | P+|u\x0b | 2002-10-15 00:39:23+0000
> %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| | P+|u\x0b | 1999-08-31 19:56:30+0000
> %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| | P+|u\x0b | 1999-04-02 20:46:26+0000
> %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| | P+|u\x0b | 1990-10-08 03:27:17+0000
> %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| | P+|u\x0b | 1984-03-31 23:30:34+0000
> %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| | P+|u\x0b | 1975-11-16 02:41:28+0000
> %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| | P+|u\x0b | 1970-04-07 07:23:48+0000
> %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| | P+|u\x0b | 1970-03-08 23:23:04+0000
> N!\x0eUA7^r7d\x06J<v< | \x1bm/c/Th\x07U | E}P^k | 2015-10-12 17:48:51+0000
> N!\x0eUA7^r7d\x06J<v< | \x1bm/c/Th\x07U | E}P^k | 2010-10-28 06:21:13+0000
> N!\x0eUA7^r7d\x06J<v< | \x1bm/c/Th\x07U | E}P^k | 2005-06-28 03:34:41+0000
> N!\x0eUA7^r7d\x06J<v< | \x1bm/c/Th\x07U | E}P^k | 2005-01-29 05:26:21+0000
> N!\x0eUA7^r7d\x06J<v< | \x1bm/c/Th\x07U | E}P^k | 2003-03-27 01:31:24+0000
> N!\x0eUA7^r7d\x06J<v< | \x1bm/c/Th\x07U | E}P^k | 2002-03-29 14:22:43+0000
> N!\x0eUA7^r7d\x06J<v< | \x1bm/c/Th\x07U | E}P^k | 2000-06-15 14:54:29+0000
> N!\x0eUA7^r7d\x06J<v< | \x1bm/c/Th\x07U | E}P^k | 1998-03-08 13:31:54+0000
> N!\x0eUA7^r7d\x06J<v< | \x1bm/c/Th\x07U | E}P^k | 1988-01-21 06:38:40+0000
> N!\x0eUA7^r7d\x06J<v< | \x1bm/c/Th\x07U | E}P^k | 1975-08-03 21:16:47+0000
> oy\x1c0077H"i\x07\x13_%\x06 | | \nz@Qj\x1cB | E}P^k | 2014-11-23 17:05:45+0000
> oy\x1c0077H"i\x07\x13_%\x06 | | \nz@Qj\x1cB | E}P^k | 2012-02-23 23:20:54+0000
> oy\x1c0077H"i\x07\x13_%\x06 | | \nz@Qj\x1cB | E}P^k | 2012-02-19 12:05:15+0000
> oy\x1c0077H"i\x07\x13_%\x06 | | \nz@Qj\x1cB | E}P^k | 2005-10-17 04:22:45+0000
> oy\x1c0077H"i\x07\x13_%\x06 | | \nz@Qj\x1cB | E}P^k | 2003-02-24 19:45:06+0000
> oy\x1c0077H"i\x07\x13_%\x06 | | \nz@Qj\x1cB | E}P^k | 1996-12-18 06:18:31+0000
> oy\x1c0077H"i\x07\x13_%\x06 | | \nz@Qj\x1cB | E}P^k | 1991-06-10 22:07:45+0000
> oy\x1c0077H"i\x07\x13_%\x06 | | \nz@Qj\x1cB | E}P^k | 1983-05-05 12:29:09+0000
> oy\x1c0077H"i\x07\x13_%\x06 | | \nz@Qj\x1cB | E}P^k | 1972-04-17 21:24:52+0000
> oy\x1c0077H"i\x07\x13_%\x06 | | \nz@Qj\x1cB | E}P^k | 1971-05-09 23:00:02+0000
> (30 rows)
> cqlsh>
> {noformat}
> If I remove the {{created_at}} clustering key, then the other two clustering
> keys are assigned varying values per partition key.
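
The row-count expectation in the description above is plain arithmetic. The following is a minimal Python sketch of it, assuming the number of rows in a partition is the product of the per-column cluster counts, which is the reporter's reading of the column spec. The names cluster_bounds, fixed_bounds and rows_per_partition are hypothetical, chosen for illustration only; they are not part of cassandra-stress.

```python
from math import prod

# (min, max) bounds of the "cluster" distribution for each clustering column,
# copied from the column spec in the report. The dict layout is an
# illustrative assumption, not a cassandra-stress data structure.
cluster_bounds = {
    "session_type": (1, 4),
    "event_type": (1, 30),
    "created_at": (10, 10),
}

# Variant from the report with the cluster distributions pinned to their maxima.
fixed_bounds = {
    "session_type": (4, 4),
    "event_type": (30, 30),
    "created_at": (10, 10),
}


def rows_per_partition(bounds):
    """Return (min_rows, max_rows) per partition.

    Assumes the row count is the product of the per-column cluster counts.
    """
    lows = [low for low, _ in bounds.values()]
    highs = [high for _, high in bounds.values()]
    return prod(lows), prod(highs)


print(rows_per_partition(cluster_bounds))  # (10, 1200)
print(rows_per_partition(fixed_bounds))    # (1200, 1200)
```

The second result matches the [1200..1200] total rows that the stress tool announces in the log excerpt, whereas the sampled data shows only 10 rows per partition key.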