[ 
https://issues.apache.org/jira/browse/CASSANDRA-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806295#comment-16806295
 ] 

Ben Slater commented on CASSANDRA-11138:
----------------------------------------

All a long time ago now but I suspect CASSANDRA-12744 (which was discussed in 
CASSANDRA-12490) will have helped with this issue but that this is a specific 
issue that still needs fixing. As Alwyn says in the main description, you'd 
expect up to 1200 rows per partition if things were working properly.

CASSANDRA-12490 shouldn't have any impact unless you use the new distribution 
type it added. 

Note that there is also CASSANDRA-13940 which is a better job of 
CASSANDRA-12744 but I just noticed has been left in open state by the 
contributor but is actually patch available.

> cassandra-stress tool - clustering key values not distributed
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-11138
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11138
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Tools
>         Environment: Cassandra 2.2.4, Centos 6.5, Java 8
>            Reporter: Ralf Steppacher
>            Assignee: Alwyn Davis
>            Priority: Normal
>              Labels: stress
>         Attachments: 11138-trunk.patch
>
>
> I am trying to get the stress tool to generate random values for three 
> clustering keys. I am trying to simulate collecting events per user id (text, 
> partition key). Events have a session type (text), event type (text), and 
> creation time (timestamp) (clustering keys, in that order). For testing 
> purposes I ended up with the following column spec:
> {noformat}
> columnspec:
> - name: created_at
>   cluster: uniform(10..10)
> - name: event_type
>   size: uniform(5..10)
>   population: uniform(1..30)
>   cluster: uniform(1..30)
> - name: session_type
>   size: fixed(5)
>   population: uniform(1..4)
>   cluster: uniform(1..4)
> - name: user_id
>   size: fixed(15)
>   population: uniform(1..1000000)
> - name: message
>   size: uniform(10..100)
>   population: uniform(1..100B)
> {noformat}
> My expectation was that this would lead to anywhere between 10 and 1200 rows 
> to be created per partition key. But it seems that exactly 10 rows are being 
> created, with the {{created_at}} timestamp being the only variable that is 
> assigned variable values (per partition key). The {{session_type}} and 
> {{event_type}} variables are assigned fixed values. This is even the case if 
> I set the cluster distribution to uniform(30..30) and uniform(4..4) 
> respectively. With this setting I expected 1200 rows per partition key to be 
> created, as announced when running the stress tool, but it is still 10.
> {noformat}
> [rsteppac@centos bin]$ ./cassandra-stress user 
> profile=../batch_too_large.yaml ops\(insert=1\) -log level=verbose 
> file=~/centos_eventy_patient_session_event_timestamp_insert_only.log -node 
> 10.211.55.8
> …
> Created schema. Sleeping 1s for propagation.
> Generating batches with [1..1] partitions and [1..1] rows (of [1200..1200] 
> total rows in the partitions)
> Improvement over 4 threadCount: 19%
> ...
> {noformat}
> Sample of generated data:
> {noformat}
> cqlsh> select user_id, event_type, session_type, created_at from 
> stresscql.batch_too_large LIMIT 30 ;
> user_id                     | event_type       | session_type | created_at
> -----------------------------+------------------+--------------+--------------------------
>   %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| |     P+|u\x0b | 2012-10-19 
> 08:14:11+0000
>   %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| |     P+|u\x0b | 2004-11-08 
> 04:04:56+0000
>   %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| |     P+|u\x0b | 2002-10-15 
> 00:39:23+0000
>   %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| |     P+|u\x0b | 1999-08-31 
> 19:56:30+0000
>   %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| |     P+|u\x0b | 1999-04-02 
> 20:46:26+0000
>   %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| |     P+|u\x0b | 1990-10-08 
> 03:27:17+0000
>   %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| |     P+|u\x0b | 1984-03-31 
> 23:30:34+0000
>   %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| |     P+|u\x0b | 1975-11-16 
> 02:41:28+0000
>   %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| |     P+|u\x0b | 1970-04-07 
> 07:23:48+0000
>   %\x7f\x03/.d29<i\$u\x114 | Y ?\x1eR|\x13\t| |     P+|u\x0b | 1970-03-08 
> 23:23:04+0000
>      N!\x0eUA7^r7d\x06J<v< |  \x1bm/c/Th\x07U |        E}P^k | 2015-10-12 
> 17:48:51+0000
>      N!\x0eUA7^r7d\x06J<v< |  \x1bm/c/Th\x07U |        E}P^k | 2010-10-28 
> 06:21:13+0000
>      N!\x0eUA7^r7d\x06J<v< |  \x1bm/c/Th\x07U |        E}P^k | 2005-06-28 
> 03:34:41+0000
>      N!\x0eUA7^r7d\x06J<v< |  \x1bm/c/Th\x07U |        E}P^k | 2005-01-29 
> 05:26:21+0000
>      N!\x0eUA7^r7d\x06J<v< |  \x1bm/c/Th\x07U |        E}P^k | 2003-03-27 
> 01:31:24+0000
>      N!\x0eUA7^r7d\x06J<v< |  \x1bm/c/Th\x07U |        E}P^k | 2002-03-29 
> 14:22:43+0000
>      N!\x0eUA7^r7d\x06J<v< |  \x1bm/c/Th\x07U |        E}P^k | 2000-06-15 
> 14:54:29+0000
>      N!\x0eUA7^r7d\x06J<v< |  \x1bm/c/Th\x07U |        E}P^k | 1998-03-08 
> 13:31:54+0000
>      N!\x0eUA7^r7d\x06J<v< |  \x1bm/c/Th\x07U |        E}P^k | 1988-01-21 
> 06:38:40+0000
>      N!\x0eUA7^r7d\x06J<v< |  \x1bm/c/Th\x07U |        E}P^k | 1975-08-03 
> 21:16:47+0000
> oy\x1c0077H"i\x07\x13_%\x06 |    | \nz@Qj\x1cB |        E}P^k | 2014-11-23 
> 17:05:45+0000
> oy\x1c0077H"i\x07\x13_%\x06 |    | \nz@Qj\x1cB |        E}P^k | 2012-02-23 
> 23:20:54+0000
> oy\x1c0077H"i\x07\x13_%\x06 |    | \nz@Qj\x1cB |        E}P^k | 2012-02-19 
> 12:05:15+0000
> oy\x1c0077H"i\x07\x13_%\x06 |    | \nz@Qj\x1cB |        E}P^k | 2005-10-17 
> 04:22:45+0000
> oy\x1c0077H"i\x07\x13_%\x06 |    | \nz@Qj\x1cB |        E}P^k | 2003-02-24 
> 19:45:06+0000
> oy\x1c0077H"i\x07\x13_%\x06 |    | \nz@Qj\x1cB |        E}P^k | 1996-12-18 
> 06:18:31+0000
> oy\x1c0077H"i\x07\x13_%\x06 |    | \nz@Qj\x1cB |        E}P^k | 1991-06-10 
> 22:07:45+0000
> oy\x1c0077H"i\x07\x13_%\x06 |    | \nz@Qj\x1cB |        E}P^k | 1983-05-05 
> 12:29:09+0000
> oy\x1c0077H"i\x07\x13_%\x06 |    | \nz@Qj\x1cB |        E}P^k | 1972-04-17 
> 21:24:52+0000
> oy\x1c0077H"i\x07\x13_%\x06 |    | \nz@Qj\x1cB |        E}P^k | 1971-05-09 
> 23:00:02+0000
> (30 rows)
> cqlsh>
> {noformat}
> If I remove the {{created_at}} clustering key, then the other two clustering 
> keys are being assigned variable values per partition key.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to