[jira] [Created] (CASSANDRA-18207) Nodetool documentation not working
Ben Slater created CASSANDRA-18207: -- Summary: Nodetool documentation not working Key: CASSANDRA-18207 URL: https://issues.apache.org/jira/browse/CASSANDRA-18207 Project: Cassandra Issue Type: Bug Components: Documentation Reporter: Ben Slater Clicking on nodetool in the Tool branch of the documentation tree doesn't do anything. Also, links found via Google search (e.g. https://cassandra.apache.org/doc/latest/cassandra/tools/nodetool/tablehistograms.html) result in a "Not Found" error. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-11138) cassandra-stress tool - clustering key values not distributed
[ https://issues.apache.org/jira/browse/CASSANDRA-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806295#comment-16806295 ] Ben Slater commented on CASSANDRA-11138: All a long time ago now but I suspect CASSANDRA-12744 (which was discussed in CASSANDRA-12490) will have helped with this issue but that this is a specific issue that still needs fixing. As Alwyn says in the main description, you'd expect up to 1200 rows per partition if things were working properly. CASSANDRA-12490 shouldn't have any impact unless you use the new distribution type it added. Note that there is also CASSANDRA-13940 which is a better job of CASSANDRA-12744 but I just noticed has been left in open state by the contributor but is actually patch available. > cassandra-stress tool - clustering key values not distributed > - > > Key: CASSANDRA-11138 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11138 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Tools > Environment: Cassandra 2.2.4, Centos 6.5, Java 8 >Reporter: Ralf Steppacher >Assignee: Alwyn Davis >Priority: Normal > Labels: stress > Attachments: 11138-trunk.patch > > > I am trying to get the stress tool to generate random values for three > clustering keys. I am trying to simulate collecting events per user id (text, > partition key). Events have a session type (text), event type (text), and > creation time (timestamp) (clustering keys, in that order). 
For testing > purposes I ended up with the following column spec: > {noformat} > columnspec: > - name: created_at > cluster: uniform(10..10) > - name: event_type > size: uniform(5..10) > population: uniform(1..30) > cluster: uniform(1..30) > - name: session_type > size: fixed(5) > population: uniform(1..4) > cluster: uniform(1..4) > - name: user_id > size: fixed(15) > population: uniform(1..100) > - name: message > size: uniform(10..100) > population: uniform(1..100B) > {noformat} > My expectation was that this would lead to anywhere between 10 and 1200 rows > to be created per partition key. But it seems that exactly 10 rows are being > created, with the {{created_at}} timestamp being the only variable that is > assigned variable values (per partition key). The {{session_type}} and > {{event_type}} variables are assigned fixed values. This is even the case if > I set the cluster distribution to uniform(30..30) and uniform(4..4) > respectively. With this setting I expected 1200 rows per partition key to be > created, as announced when running the stress tool, but it is still 10. > {noformat} > [rsteppac@centos bin]$ ./cassandra-stress user > profile=../batch_too_large.yaml ops\(insert=1\) -log level=verbose > file=~/centos_eventy_patient_session_event_timestamp_insert_only.log -node > 10.211.55.8 > … > Created schema. Sleeping 1s for propagation. > Generating batches with [1..1] partitions and [1..1] rows (of [1200..1200] > total rows in the partitions) > Improvement over 4 threadCount: 19% > ... 
> {noformat} > Sample of generated data: > {noformat} > cqlsh> select user_id, event_type, session_type, created_at from > stresscql.batch_too_large LIMIT 30 ; > user_id | event_type | session_type | created_at > -+--+--+-- > %\x7f\x03/.d29 08:14:11+ > %\x7f\x03/.d29 04:04:56+ > %\x7f\x03/.d29 00:39:23+ > %\x7f\x03/.d29 19:56:30+ > %\x7f\x03/.d29 20:46:26+ > %\x7f\x03/.d29 03:27:17+ > %\x7f\x03/.d29 23:30:34+ > %\x7f\x03/.d29 02:41:28+ > %\x7f\x03/.d29 07:23:48+ > %\x7f\x03/.d29 23:23:04+ > N!\x0eUA7^r7d\x06J 17:48:51+ > N!\x0eUA7^r7d\x06J 06:21:13+ > N!\x0eUA7^r7d\x06J 03:34:41+ > N!\x0eUA7^r7d\x06J 05:26:21+ > N!\x0eUA7^r7d\x06J 01:31:24+ > N!\x0eUA7^r7d\x06J 14:22:43+ > N!\x0eUA7^r7d\x06J 14:54:29+ > N!\x0eUA7^r7d\x06J 13:31:54+ > N!\x0eUA7^r7d\x06J 06:38:40+ > N!\x0eUA7^r7d\x06J 21:16:47+ > oy\x1c0077H"i\x07\x13_%\x06 || \nz@Qj\x1cB |E}P^k | 2014-11-23 > 17:05:45+ > oy\x1c0077H"i\x07\x13_%\x06 || \nz@Qj\x1cB |E}P^k | 2012-02-23 > 23:20:54+ > oy\x1c0077H"i\x07\x13_%\x06 || \nz@Qj\x1cB |E}P^k | 2012-02-19 > 12:05:15+ > oy\x1c0077H"i\x07\x13_%\x06 || \nz@Qj\x1cB |E}P^k | 2005-10-17 > 04:22:45+ > oy\x1c0077H"i\x07\x13_%\x06 || \nz@Qj\x1cB |E}P^k | 2003-02-24 > 19:45:06+ > oy\x1c0077H"i\x07\x13_%\x06 || \nz@Qj\x1cB |E}P^k | 1996-12-18 > 06:18:31+ > oy\x1c0077H"i\x07\x13_%\x06 || \nz@Qj\x1cB |E}P^k | 1991-06-10 > 22:07:45+ >
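The 10-to-1200 rows-per-partition expectation in the quoted CASSANDRA-11138 description can be checked with a little arithmetic. The sketch below is only an illustration: it assumes the per-partition row count is bounded by the product of the per-column `cluster` ranges, and the helper name `rows_per_partition_bounds` is hypothetical, not part of cassandra-stress.

```python
from math import prod

# Cluster ranges (min, max) copied from the columnspec in the quoted description.
cluster_ranges = {
    "created_at": (10, 10),
    "event_type": (1, 30),
    "session_type": (1, 4),
}


def rows_per_partition_bounds(ranges):
    """Return (min, max) rows per partition as the product of per-column bounds.

    Assumption: each clustering column contributes independently, so the
    bounds multiply. This is a reading of the report, not stress-tool code.
    """
    lows = [lo for lo, _ in ranges.values()]
    highs = [hi for _, hi in ranges.values()]
    return prod(lows), prod(highs)


print(rows_per_partition_bounds(cluster_ranges))  # (10, 1200)
```

With the ranges pinned to `uniform(30..30)` and `uniform(4..4)`, both bounds become 1200, which matches the `[1200..1200]` figure the tool announces in the quoted log.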
[jira] [Commented] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414966#comment-16414966 ] Ben Slater commented on CASSANDRA-8460: --- OK. Talking to Lerh, his code is just about at the point where we can do some initial benchmarking, so we'll run some tests to compare the two approaches and report what we get. > Make it possible to move non-compacting sstables to slow/big storage in DTCS > > > Key: CASSANDRA-8460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Lerh Chuan Low >Priority: Major > Labels: doc-impacting, dtcs > Fix For: 4.x > > > It would be nice if we could configure DTCS to have a set of extra data > directories where we move the sstables once they are older than > max_sstable_age_days. > This would enable users to have a quick, small SSD for hot, new data, and big > spinning disks for data that is rarely read and never compacted.
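The rule in the quoted description (move sstables to the extra directories once they are older than max_sstable_age_days) can be sketched as a simple age check. Everything here is hypothetical for illustration: the function name, the directory paths and the use of an sstable's newest timestamp are assumptions, not Cassandra's implementation.

```python
from datetime import datetime, timedelta


def pick_data_directory(sstable_max_timestamp, now, max_sstable_age_days,
                        fast_dir, slow_dir):
    """Choose a directory for an sstable purely from its age.

    Hypothetical helper: an sstable whose newest data is older than
    max_sstable_age_days goes to the slow directory, otherwise it stays
    on the fast one.
    """
    age = now - sstable_max_timestamp
    if age > timedelta(days=max_sstable_age_days):
        return slow_dir
    return fast_dir


now = datetime(2018, 3, 27)
print(pick_data_directory(now - timedelta(days=1), now, 30, "/mnt/ssd", "/mnt/hdd"))
print(pick_data_directory(now - timedelta(days=40), now, 30, "/mnt/ssd", "/mnt/hdd"))
```

The point of the design is that placement depends only on age, never on how often the data has been read.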
[jira] [Commented] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414793#comment-16414793 ] Ben Slater commented on CASSANDRA-8460: --- Fair enough - the example was a bit of an oversimplification even for how I would have guessed it works. Having read up a bit (https://www.redhat.com/en/blog/improving-read-performance-dm-cache and https://www.kernel.org/doc/Documentation/device-mapper/cache-policies.txt), I suspect we've actually got a bit of a different model of the use cases we are both imagining (and I haven't done a great job of describing what I have in mind). Consider you're building an IoT application that collects sensor data and has some kind of UI for displaying readings. You want to be able to provide an experience for your users where accessing today's data (the most common use) is snappy while still providing the ability to go back in time a year, but as that's not common it's fine for access to that data to be slower. In this scenario the recent data isn't "hot" in the sense that it is accessed many times (I'm not sure there is a well defined term for what it is - maybe "high priority" is better?), so it's hard for a caching algorithm (like smq) based on frequency of access to work effectively (in fact the first access is the one you want to be fast). Does that make more sense as to where I'm coming from? > Make it possible to move non-compacting sstables to slow/big storage in DTCS > > > Key: CASSANDRA-8460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Lerh Chuan Low >Priority: Major > Labels: doc-impacting, dtcs > Fix For: 4.x > > > It would be nice if we could configure DTCS to have a set of extra data > directories where we move the sstables once they are older than > max_sstable_age_days. > This would enable users to have a quick, small SSD for hot, new data, and big > spinning disks for data that is rarely read and never compacted.
[jira] [Commented] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414766#comment-16414766 ] Ben Slater commented on CASSANDRA-8460: --- I'm not sure it's necessarily easier (because you now have two separate pools of disk to manage) but I think it is more predictable - your data will always be on the fast disk until it reaches the age you specify. With LVM (possibly depending on its rules about how and when to cache - I admit I don't know a lot about tuning possibilities there) you could end up with issues like one of your users deciding to do some analysis/extract a heap of old data, which ends up evicting the recent data from your cache and causes what you expected to be hot data to slow down. > Make it possible to move non-compacting sstables to slow/big storage in DTCS > > > Key: CASSANDRA-8460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Lerh Chuan Low >Priority: Major > Labels: doc-impacting, dtcs > Fix For: 4.x > > > It would be nice if we could configure DTCS to have a set of extra data > directories where we move the sstables once they are older than > max_sstable_age_days. > This would enable users to have a quick, small SSD for hot, new data, and big > spinning disks for data that is rarely read and never compacted.
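The eviction concern in this comment (a one-off read of old data pushing recent data out of the cache) can be illustrated with a toy simulation. This is a deliberately naive LRU cache written for the example; it is not a model of LVM, dm-cache or its smq policy, whose behaviour the comment itself says may differ, and the block names and sizes are made up.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal least-recently-used cache over opaque block names."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def read(self, block):
        """Return True on a cache hit, False on a miss (the block is then cached)."""
        if block in self.blocks:
            self.blocks.move_to_end(block)  # mark as most recently used
            return True
        self.blocks[block] = None
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return False


cache = LRUCache(capacity=100)
recent = [f"recent-{i}" for i in range(100)]
old = [f"old-{i}" for i in range(1000)]

for block in recent:          # warm the cache with the recent data
    cache.read(block)
hits_before = sum(cache.read(block) for block in recent)

for block in old:             # one user scans a large range of old data
    cache.read(block)
hits_after = sum(cache.read(block) for block in recent)

print(hits_before, hits_after)  # 100 0
```

With age-based placement the second number could not drop, because the recent data stays on the fast disk regardless of what else is read.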
[jira] [Commented] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414719#comment-16414719 ] Ben Slater commented on CASSANDRA-8460: --- Thinking some more about this, I think the other (and perhaps most important) advantage of implementing in Cassandra is predictability for operators. It's easy to say, for example: if I want data < 1 month old to be fast, I need enough fast disk space for that and I know it will be consistently fast; after that, I need X disk space for the older data and I know it will be slower (and can even clearly tell users that). Trying to tune performance of the hot data (and avoid latency spikes) with Cassandra + LVM sounds pretty hard. > Make it possible to move non-compacting sstables to slow/big storage in DTCS > > > Key: CASSANDRA-8460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Lerh Chuan Low >Priority: Major > Labels: doc-impacting, dtcs > Fix For: 4.x > > > It would be nice if we could configure DTCS to have a set of extra data > directories where we move the sstables once they are older than > max_sstable_age_days. > This would enable users to have a quick, small SSD for hot, new data, and big > spinning disks for data that is rarely read and never compacted.
[jira] [Comment Edited] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414719#comment-16414719 ] Ben Slater edited comment on CASSANDRA-8460 at 3/26/18 10:55 PM: - Thinking some more about this, I think the other (and perhaps most important) advantage of implementing in Cassandra is predictability for operators. It's easy to say, for example: if I want data < 1 month old to be fast, I need enough fast disk space for that and I know it will be consistently fast; after that, I need X disk space for the older data and I know it will be slower (and can even clearly tell users that). Trying to tune performance of the hot data (and avoid latency spikes) with Cassandra + LVM sounds pretty hard. was (Author: slater_ben): Think some more about this I think the other (and perhaps most important) advantage of implementing in Cassandra is predictability for operators. It's easy to say, for example, if I want data < 1 month old to be fast 1 need enough fast disk space for that and I know it will be consistently fast after that I need X disk space for the older data and I know it will be slower (and can even clearly tell users that). Trying to tune performance of the hot data (and avoid latency spikes) with with Cassandra + LVM sounds pretty hard. > Make it possible to move non-compacting sstables to slow/big storage in DTCS > > > Key: CASSANDRA-8460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Lerh Chuan Low >Priority: Major > Labels: doc-impacting, dtcs > Fix For: 4.x > > > It would be nice if we could configure DTCS to have a set of extra data > directories where we move the sstables once they are older than > max_sstable_age_days. > This would enable users to have a quick, small SSD for hot, new data, and big > spinning disks for data that is rarely read and never compacted.
[jira] [Commented] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414547#comment-16414547 ] Ben Slater commented on CASSANDRA-8460: --- Hi John, I've been setting the requirements from our (Instaclustr) point of view for Lerh here, so I thought I'd weigh in on why I'd rather see a Cassandra-based solution than LVM. The requirement we're looking to target, as per the original JIRA, is people who have data that is hot for a short period but which they then need to keep around for a long time with infrequent access (ie well defined rules on hot vs cold, not deciding what is hot based on what was recently read). Typically when I've seen this requirement people want: 1) The best possible performance for the hot data 2) Lowest cost of storage for the cold data. It seems to me that with LVM we're not doing the best we could in terms of either of these. For performance, there is the write-through slowdown you mentioned; depending on where you draw the line on moving to slow disk vs the final TWCS compaction, you might have compactions pushing data you want to be quick out of cache; and if you used EBS for both the hot disk and the slow disk, you are increasing usage of the EBS bandwidth to copy to and from cache (although using local SSD as the cache negates this last one). In terms of cost, with LVM the fast disk is purely being used as cache rather than a primary store, so you are having to duplicate that amount of data storage - whether that is significant probably depends on your desired ratio of fast to slow disk and how cost sensitive you are. Whether these downsides are worth the extra complexity is of course a matter of judgement rather than facts, so I'm happy to go with the community consensus here but thought I'd put in my POV.
Cheers Ben > Make it possible to move non-compacting sstables to slow/big storage in DTCS > > > Key: CASSANDRA-8460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Lerh Chuan Low >Priority: Major > Labels: doc-impacting, dtcs > Fix For: 4.x > > > It would be nice if we could configure DTCS to have a set of extra data > directories where we move the sstables once they are older than > max_sstable_age_days. > This would enable users to have a quick, small SSD for hot, new data, and big > spinning disks for data that is rarely read and never compacted.
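The cost point in this comment (a fast disk used purely as cache duplicates data that also lives on the slow disk) comes down to simple arithmetic. The sketch below uses made-up sizes and a hypothetical helper name; it only shows the shape of the comparison and says nothing about real prices or about how LVM sizes its cache.

```python
def capacity_needed(total_gb, hot_gb, fast_disk_is_cache):
    """Return the fast and slow capacity needed for a dataset.

    Assumptions for illustration: when the fast disk is a cache, the slow
    disk must still hold the full dataset; when the fast disk is a primary
    store for hot data, the slow disk only holds the cold remainder.
    """
    if fast_disk_is_cache:
        return {"fast_gb": hot_gb, "slow_gb": total_gb}
    return {"fast_gb": hot_gb, "slow_gb": total_gb - hot_gb}


# Hypothetical dataset: 1000 GB in total, of which 100 GB is hot.
print(capacity_needed(1000, 100, fast_disk_is_cache=True))
print(capacity_needed(1000, 100, fast_disk_is_cache=False))
```

As the comment notes, whether the duplicated amount matters depends on the ratio of fast to slow disk.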
[jira] [Commented] (CASSANDRA-13940) Fix stress seed multiplier
[ https://issues.apache.org/jira/browse/CASSANDRA-13940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196081#comment-16196081 ] Ben Slater commented on CASSANDRA-13940: Just got a chance to take a look at this and I agree this is a much better/cleaner implementation than mine. Looking back at it, I think the different multiplier values were only really necessary for the very small runs that I was initially testing on, which are probably too small to be particularly relevant. I rechecked a few of my more realistic tests with the static multiplier and results were pretty similar. So, no objection from me to this. Maybe @tjake can review as he reviewed the initial patch? > Fix stress seed multiplier > -- > > Key: CASSANDRA-13940 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13940 > Project: Cassandra > Issue Type: Bug > Components: Stress >Reporter: Daniel Cranford > Attachments: 0001-Fixing-seed-multiplier.patch > > > CASSANDRA-12744 attempted to fix a problem with partition key generation, but > is generally broken. E.G. > {noformat} > cassandra-stress -insert visits=fixed\(100\) revisit=uniform\(1..100\) ... > {noformat} > sends cassandra-stress into an infinite loop. Here's a better fix.
[jira] [Commented] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043871#comment-16043871 ] Ben Slater commented on CASSANDRA-12744: One extra note for future searching: There is a fair chance this fix will change the workload quite substantially in a number of scenarios. So, if you want to compare benchmarks make sure you don't compare results from stress with this fix vs stress without this fix. > Randomness of stress distributions is not good > -- > > Key: CASSANDRA-12744 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12744 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: Ben Slater >Priority: Minor > Labels: stress > Fix For: 4.0 > > Attachments: CASSANDRA_12744_SeedManager_changes-trunk.patch > > > The randomness of our distributions is pretty bad. We are using the > JDKRandomGenerator() but in testing of uniform(1..3) we see for 100 > iterations it's only outputting 3. If you bump it to 10k it hits all 3 > values. > I made a change to just use the default commons math random generator and now > see all 3 values for n=10
[jira] [Commented] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16038091#comment-16038091 ] Ben Slater commented on CASSANDRA-12744: Looks like the test failures are unrelated? Are we OK to commit? > Randomness of stress distributions is not good > -- > > Key: CASSANDRA-12744 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12744 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: Ben Slater >Priority: Minor > Labels: stress > Fix For: 4.0 > > Attachments: CASSANDRA_12744_SeedManager_changes-trunk.patch > > > The randomness of our distributions is pretty bad. We are using the > JDKRandomGenerator() but in testing of uniform(1..3) we see for 100 > iterations it's only outputting 3. If you bump it to 10k it hits all 3 > values. > I made a change to just use the default commons math random generator and now > see all 3 values for n=10
[jira] [Comment Edited] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16028803#comment-16028803 ] Ben Slater edited comment on CASSANDRA-12744 at 5/30/17 7:18 AM: -
After some more digging, I've come to the conclusion that the issue is that the JDKRandomGenerator creates close random numbers when seeded with close values. So, when running with a small range of potential seeds (from the population) you end up with different random doubles which all round to the same long value. The attached patch multiplies the generated seed so that max seed values are of the order of 10^22. I've tested this against a couple of the failed dtests and they pass OK. In addition, I get the following results from a range of YAML files (the "without multiplier" result is unmodified trunk; the "with multiplier" result is with this patch applied):

Example 1:
table: test5
table_definition: |
  CREATE TABLE test5 (
    pk int,
    val text,
    PRIMARY KEY (pk)
  )
columnspec:
  - name: pk
    size: fixed(64)
    population: uniform(1..500)

user profile=... ops(insert=1) n=1000 cl=ALL no-warmup -rate threads=5 -node 127.0.0.1
without multiplier - 47 rows
with multiplier - 490 rows

table: test4
table_definition: |
  CREATE TABLE test4 (
    pk int,
    pk2 text,
    val text,
    PRIMARY KEY ((pk,pk2))
  )
columnspec:
  - name: pk
    size: fixed(2)
    population: uniform(1..5)
  - name: pk2
    size: fixed(2)
    population: uniform(1..5)

user profile=... ops(insert=1) n=1000 cl=ALL no-warmup -rate threads=5 -node 127.0.0.1
without multiplier - 1 row
with multiplier - 25 rows

table: test4
table_definition: |
  CREATE TABLE test4 (
    pk int,
    pk2 text,
    val text,
    PRIMARY KEY ((pk,pk2))
  )
columnspec:
  - name: pk
    size: fixed(2)
    population: uniform(1..500M)
  - name: pk2
    size: fixed(2)
    population: uniform(1..5)

user profile=... ops(insert=1) n=1000 cl=ALL no-warmup -rate threads=5 -node 127.0.0.1
without multiplier - 1000 rows
with multiplier - 1000 rows

table: test7
table_definition: |
  CREATE TABLE test7 (
    pk int,
    pk2 text,
    ck1 text,
    val text,
    PRIMARY KEY ((pk,pk2), ck1)
  )
columnspec:
  - name: pk
    size: fixed(2)
    population: uniform(1..100)
  - name: pk2
    size: fixed(4)
    population: uniform(1..1)
  - name: pk2
    size: fixed(4)
    population: uniform(1..1000)

user profile=... ops(insert=1) n=10 cl=ALL no-warmup -rate threads=5 -node 127.0.0.1
without multiplier - 10342 rows
with multiplier - 63387 rows

table_definition: |
  CREATE TABLE test7 (
    pk int,
    pk2 text,
    ck1 text,
    val text,
    PRIMARY KEY ((pk,pk2), ck1)
  )
columnspec:
  - name: pk
    size: fixed(4)
    population: seq(1..100)
  - name: pk2
    size: fixed(10)
    population: seq(1..1)
  - name: pk2
    size: fixed(10)
    cluster: uniform(1..1000)
    population: seq(1..1000)

user profile=... ops(insert=1) n=10 cl=ALL no-warmup -rate threads=5 -node 127.0.0.1
without multiplier - 25000 rows
with multiplier - 43304 rows

was (Author: slater_ben):
After some more digging, I've come to the conclusion that the issue is that the JDKRandomGenerator creates close random numbers when seeded with close values. So, when running with a small range of potential seeds (from the population) you end up with different random doubles which all round to the same long value. The attached patch multiplies the generated seed so that max seed values are of the order of 10^22. I've tested this against a couple of the failed dtests and pass OK. In addition, I get the following results from a range of YAML files:

Example 1:
table: test5
table_definition: |
  CREATE TABLE test5 (
    pk int,
    val text,
    PRIMARY KEY (pk)
  )
columnspec:
  - name: pk
    size: fixed(64)
    population: uniform(1..500)

user profile=... ops(insert=1) n=1000 cl=ALL no-warmup -rate threads=5 -node 127.0.0.1
without multiplier - 47 rows
with multiplier - 490 rows

table: test4
table_definition: |
  CREATE TABLE test4 (
    pk int,
    pk2 text,
    val text,
    PRIMARY KEY ((pk,pk2))
  )
columnspec:
  - name: pk
    size: fixed(2)
    population: uniform(1..5)
  - name: pk2
    size: fixed(2)
    population: uniform(1..5)

user profile=... ops(insert=1) n=1000 cl=ALL no-warmup -rate threads=5 -node 127.0.0.1
without multipler - 1 row
with multiplier - 25 rows

table: test4
table_definition: |
  CREATE TABLE test4 (
    pk int,
    pk2 text,
    val text,
    PRIMARY KEY ((pk,pk2))
  )
[jira] [Updated] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12744: --- Attachment: CASSANDRA_12744_SeedManager_changes-trunk.patch > Randomness of stress distributions is not good > -- > > Key: CASSANDRA-12744 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12744 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: Ben Slater >Priority: Minor > Labels: stress > Fix For: 4.0 > > Attachments: CASSANDRA_12744_SeedManager_changes-trunk.patch > > > The randomness of our distributions is pretty bad. We are using the > JDKRandomGenerator() but in testing of uniform(1..3) we see for 100 > iterations it's only outputting 3. If you bump it to 10k it hits all 3 > values. > I made a change to just use the default commons math random generator and now > see all 3 values for n=10
[jira] [Updated] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12744: --- Reviewer: T Jake Luciani Status: Patch Available (was: Open)
After some more digging, I've come to the conclusion that the issue is that the JDKRandomGenerator creates close random numbers when seeded with close values. So, when running with a small range of potential seeds (from the population) you end up with different random doubles which all round to the same long value. The attached patch multiplies the generated seed so that max seed values are of the order of 10^22. I've tested this against a couple of the failed dtests and they pass OK. In addition, I get the following results from a range of YAML files:

Example 1:
table: test5
table_definition: |
  CREATE TABLE test5 (
    pk int,
    val text,
    PRIMARY KEY (pk)
  )
columnspec:
  - name: pk
    size: fixed(64)
    population: uniform(1..500)

user profile=... ops(insert=1) n=1000 cl=ALL no-warmup -rate threads=5 -node 127.0.0.1
without multiplier - 47 rows
with multiplier - 490 rows

table: test4
table_definition: |
  CREATE TABLE test4 (
    pk int,
    pk2 text,
    val text,
    PRIMARY KEY ((pk,pk2))
  )
columnspec:
  - name: pk
    size: fixed(2)
    population: uniform(1..5)
  - name: pk2
    size: fixed(2)
    population: uniform(1..5)

user profile=... ops(insert=1) n=1000 cl=ALL no-warmup -rate threads=5 -node 127.0.0.1
without multiplier - 1 row
with multiplier - 25 rows

table: test4
table_definition: |
  CREATE TABLE test4 (
    pk int,
    pk2 text,
    val text,
    PRIMARY KEY ((pk,pk2))
  )
columnspec:
  - name: pk
    size: fixed(2)
    population: uniform(1..500M)
  - name: pk2
    size: fixed(2)
    population: uniform(1..5)

user profile=... ops(insert=1) n=1000 cl=ALL no-warmup -rate threads=5 -node 127.0.0.1
without multiplier - 1000 rows
with multiplier - 1000 rows

table: test7
table_definition: |
  CREATE TABLE test7 (
    pk int,
    pk2 text,
    ck1 text,
    val text,
    PRIMARY KEY ((pk,pk2), ck1)
  )
columnspec:
  - name: pk
    size: fixed(2)
    population: uniform(1..100)
  - name: pk2
    size: fixed(4)
    population: uniform(1..1)
  - name: pk2
    size: fixed(4)
    population: uniform(1..1000)

user profile=... ops(insert=1) n=10 cl=ALL no-warmup -rate threads=5 -node 127.0.0.1
without multiplier - 10342 rows
with multiplier - 63387 rows

table_definition: |
  CREATE TABLE test7 (
    pk int,
    pk2 text,
    ck1 text,
    val text,
    PRIMARY KEY ((pk,pk2), ck1)
  )
columnspec:
  - name: pk
    size: fixed(4)
    population: seq(1..100)
  - name: pk2
    size: fixed(10)
    population: seq(1..1)
  - name: pk2
    size: fixed(10)
    cluster: uniform(1..1000)
    population: seq(1..1000)

user profile=... ops(insert=1) n=10 cl=ALL no-warmup -rate threads=5 -node 127.0.0.1
without multiplier - 25000 rows
with multiplier - 43304 rows

> Randomness of stress distributions is not good > -- > > Key: CASSANDRA-12744 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12744 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: Ben Slater >Priority: Minor > Labels: stress > Fix For: 4.0 > > > The randomness of our distributions is pretty bad. We are using the > JDKRandomGenerator() but in testing of uniform(1..3) we see for 100 > iterations it's only outputting 3. If you bump it to 10k it hits all 3 > values. > I made a change to just use the default commons math random generator and now > see all 3 values for n=10
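The "close seeds give close random numbers" finding in this comment can be reproduced outside Cassandra. The sketch below is a pure-Python port of the linear congruential generator documented for `java.util.Random`, on the assumption that commons-math's JDKRandomGenerator delegates to it. The way a double is mapped onto `lo..hi`, the helper names and the seed scale `1 << 39` are all assumptions picked for illustration; the scale in particular is not the multiplier used in the attached patch.

```python
MASK = (1 << 48) - 1
MULTIPLIER = 0x5DEECE66D


class JavaRandom:
    """Pure-Python port of java.util.Random's documented LCG (nextDouble only)."""

    def __init__(self, seed):
        # java.util.Random scrambles the seed with the multiplier on construction.
        self.state = (seed ^ MULTIPLIER) & MASK

    def _next(self, bits):
        self.state = (self.state * MULTIPLIER + 0xB) & MASK
        return self.state >> (48 - bits)

    def next_double(self):
        return ((self._next(26) << 27) + self._next(27)) / float(1 << 53)


def first_uniform(seed, lo, hi):
    """First draw for a seed, mapped onto the integers lo..hi (assumed mapping)."""
    return lo + int(JavaRandom(seed).next_double() * (hi - lo + 1))


def distinct_first_values(seeds, lo, hi, seed_scale=1):
    """Count the distinct first draws across seeds, optionally scaling each seed."""
    return len({first_uniform(seed * seed_scale, lo, hi) for seed in seeds})


seeds = range(1, 501)
print(distinct_first_values(seeds, 1, 5))                       # 1
print(distinct_first_values(seeds, 1, 5, seed_scale=1 << 39))   # 5
```

With seeds 1..500 every generator's first double lands in the same narrow band, so all of them map to the same value. Spreading the seeds across the 48-bit state lets the first draws cover the whole range, which is the general idea behind multiplying the seed.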
[jira] [Commented] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027755#comment-16027755 ] Ben Slater commented on CASSANDRA-12744:
Actually, I think it's a bit more complex than I just said, but I still think it's related to the interaction between the population distribution and the individual column distributions. I just tried 10,000 inserts with -pop dist=uniform(1..25) and the following YAML and only got 1 row inserted.

table_definition: |
  CREATE TABLE test4 (
    pk text,
    pk2 text,
    val text,
    PRIMARY KEY ((pk,pk2))
  )
columnspec:
  - name: pk
    size: fixed(2)
    population: exp(1..5)
  - name: pk2
    size: fixed(2)
    population: exp(1..5)

Running with -pop dist=uniform(1..10B) gives the expected 25 rows, so it may be as simple as just setting a really big default population when running in user mode, but I'll do a bit more digging.
> Randomness of stress distributions is not good > -- > > Key: CASSANDRA-12744 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12744 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: Ben Slater >Priority: Minor > Labels: stress > Fix For: 4.0 > > > The randomness of our distributions is pretty bad. We are using the > JDKRandomGenerator() but in testing of uniform(1..3) we see for 100 > iterations it's only outputting 3. If you bump it to 10k it hits all 3 > values. > I made a change to just use the default commons math random generator and now > see all 3 values for n=10
[jira] [Comment Edited] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027722#comment-16027722 ] Ben Slater edited comment on CASSANDRA-12744 at 5/28/17 8:04 AM: - So I took a look into this with the following findings: 1) The dtest is broken because it assumes that when you run c*-stress with n=1 you will end up with 10,000 rows inserted when I think the actual functional guarantee is that it will run 10,000 insert operations. 2) However, with the JDKRandomGenerator this assumption holds up to a few hundred thousand records. Even with n=1M you end up with 999,999 records in the table. For some reason, changing to the library default Well19937c generator means not only is the assumption broken at n=10k but it seems to get proportionally worse as n increases. So, on those findings, I don't think changing the generator is a good idea. So, I tried to dig a bit deeper into what was causing the issue. As part of this, I wrote some code to generate values directly from the distributions in various ways and the results all seemed as expected (ie reasonably aligned with the distribution type). After a bit more digging, and to cut a long story short, I found that the actual issue is related to the -pop setting. I'm still a bit hazy on this but it seems -pop is the distribution of all possible keys. So, if I have a -pop of dist(1..10) I can only have 10 possible key values (ie combinations across all columns) no matter what the ranges specified for the key column in the YAML file are. The default for -pop is UNIFORM(1..n) where n is specified or 1..1,000,000 where no n is specified. I think this all results in somewhat counter-intuitive results, particularly with multi-part keys. So, I think the actual answer here is to change the rules for the default -pop for yaml runs to have a population size equal to the product of the population size of each key as specified in the YAML. 
For example, if I have two columns: partition_key UNIFORM(1..1M) cluster_key UNIFORM(1..100) then the default population should be 1..100M. I think this is already implied by the YAML and what people would expect (certainly what I expected). I've done a few tests manually setting the pop and it seems to do what's expected. I don't think this change will be too hard to make but interested to hear if anyone has any opinions before I jump into it. was (Author: slater_ben): So I took a look into this with the following findings: 1) The dtest is broken because it assumes that when you run c*-stress with n=1 you will end up with 10,000 rows inserted when I think the actual functional guarantee is that it will run 10,000 insert operations. 2) However, with the JDKRandomGenerator this assumption holds up to a few hundred thousand records. Even with n=1M you end up with 999,999 records in the table. For some reason, changing to the library default Well19937c generator means not only is the assumption broken at n=10k but it seems to get proportionally worse as n increases. So, on those findings, I don't think changing the generator is a good idea. So, I tried to dig a bit deeper into what was causing the issue. As part of this, I wrote some code to generate values directly from the distributions in various ways and the results all seemed as expected (ie reasonably aligned with the distribution type). After a bit more digging, and to cut a long story short, I found that the actual issue is related to the -pop setting. I'm still a bit hazy on this but it seems -pop is the distribution of all possible keys. So, if I have a -pop of dist(1..10) I can only have 10 possible key values (ie combinations across all columns) no matter what the ranges specified for the key column in the YAML file are. The default for -pop is UNIFORM(1..n) where n is specified or 1..1,000,000 where no n is specified. I think this all results in somewhat counter-intuitive results, particularly with multi-part keys. 
So, I think the actual answer here is to change the rules for the default -pop for yaml runs to have a population size equal to the product of the population size of each key as specified in the YAML. For example, if I have two columns: partition_key UNIFORM(1..1M) cluster_key UNIFORM(1..100) then the default population should be 1..100M. I think this is already implied by the YAML and what people would expect (certainly what I expected). I don't think this change will be too hard to make but interested to hear if anyone has any opinions before I jump into it. > Randomness of stress distributions is not good > -- > > Key: CASSANDRA-12744 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12744 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >
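The proposed default can be illustrated with a small calculation. This is only a sketch in Python, not cassandra-stress code: `range_size` and `default_population_size` are hypothetical helper names, and the sketch assumes each key column's population is an inclusive integer range, as in the UNIFORM(1..1M) / UNIFORM(1..100) example from the comment.

```python
from math import prod


def range_size(low: int, high: int) -> int:
    """Number of distinct values in the inclusive range low..high."""
    if high < low:
        raise ValueError(f"empty range {low}..{high}")
    return high - low + 1


def default_population_size(key_ranges):
    """Product of the per-column population sizes (hypothetical helper).

    key_ranges is a list of (low, high) tuples, one per key column.
    """
    return prod(range_size(low, high) for low, high in key_ranges)


# partition_key UNIFORM(1..1M), cluster_key UNIFORM(1..100)
key_ranges = [(1, 1_000_000), (1, 100)]
size = default_population_size(key_ranges)
print(f"default -pop would be 1..{size}")
```

With the two columns from the comment the product is 100,000,000, which matches the 1..100M population suggested there. Two key columns with five possible values each would give 25 combinations.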
[jira] [Comment Edited] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027722#comment-16027722 ] Ben Slater edited comment on CASSANDRA-12744 at 5/28/17 6:49 AM: - So I took a look into this with the following findings: 1) The dtest is broken because it assumes that when you run c*-stress with n=1 you will end up with 10,000 rows inserted when I think the actual functional guarantee is that it will run 10,000 insert operations. 2) However, with the JDKRandomGenerator this assumption holds up to a few hundred thousand records. Even with n=1M you end up with 999,999 records in the table. For some reason, changing to the library default Well19937c generator means not only is the assumption broken at n=10k but it seems to get proportionally worse as n increases. So, on those findings, I don't think changing the generator is a good idea. So, I tried to dig a bit deeper into what was causing the issue. As part of this, I wrote some code to generate values directly from the distributions in various ways and the results all seemed as expected (ie reasonably aligned with the distribution type). After a bit more digging, and to cut a long story short, I found that the actual issue is related to the -pop setting. I'm still a bit hazy on this but it seems -pop is the distribution of all possible keys. So, if I have a -pop of dist(1..10) I can only have 10 possible key values (ie combinations across all columns) no matter what the ranges specified for the key column in the YAML file are. The default for -pop is UNIFORM(1..n) where n is specified or 1..1,000,000 where no n is specified. I think this all results in somewhat counter-intuitive results, particularly with multi-part keys. So, I think the actual answer here is to change the rules for the default -pop for yaml runs to have a population size equal to the product of the population size of each key as specified in the YAML. 
For example, if I have two columns: partition_key UNIFORM(1..1M) cluster_key UNIFORM(1..100) then the default population should be 1..100M. I think this is already implied by the YAML and what people would expect (certainly what I expected). I don't think this change will be too hard to make but interested to hear if anyone has any opinions before I jump into it. was (Author: slater_ben): So I took a look into this with the following findings: 1) The dtest is broken because it assumes that when you run c*-stress with n=1 you will end up with 10,000 rows inserted when I think the actual functional guarantee is that it will run 10,000 insert operations. 2) However, with the JDKRandomGenerator this assumption holds up to a few hundred thousand records. Even with n=1M you end up with 999,999 records in the table. For some reason, changing to the library default Well19937c generator means not only is the assumption broken at n=10k but it seems to get proportionally worse as n increases. So, on those findings, I don't think changing the generator is a good idea. So, I tried to dig a bit deeper into what was causing the issue. As part of this, I wrote some code to generate values directly from the distributions in various ways and the results all seemed as expected (ie reasonably aligned with the distribution type). After a bit more digging, and to cut a long story short, I found that the actual issue is related to the -pop setting. I'm still a bit hazy on this but it seems -pop is the distribution of all possible keys. So, if I have a -pop of dist(1..10) I can only have 10 possible key values (ie combinations across all columns) no matter what the ranges specified for the key column in the YAML file are. The default for -pop is UNIFORM(1..n) where n is specified or 1..1,000,000 where no n is specified. I think this all results in somewhat counter-intuitive results, particularly with multi-part keys. 
So, I think the actual answer here is to change the rules for the default -pop for yaml runs to have a population size equal to the product of the population size of each key as specified in the YAML. For example, if I have two columns: partition_key UNIFORM(1..1M) cluster_key UNIFORM(1..100) then the default population should be 1..100M. I think this is already implied by the YAML and what people would expect (certainly what I expected). I don't think this change will be too hard to make but interested to hear if anyone has any opinions before I jump into it. > Randomness of stress distributions is not good > -- > > Key: CASSANDRA-12744 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12744 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: Ben Slater >Priority: Minor > Labels: stress >
[jira] [Commented] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027722#comment-16027722 ] Ben Slater commented on CASSANDRA-12744: So I took a look into this with the following findings: 1) The dtest is broken because it assumes that when you run c*-stress with n=1 you will end up with 10,000 rows inserted when I think the actual functional guarantee is that it will run 10,000 insert operations. 2) However, with the JDKRandomGenerator this assumption holds up to a few hundred thousand records. Even with n=1M you end up with 999,999 records in the table. For some reason, changing to the library default Well19937c generator means not only is the assumption broken at n=10k but it seems to get proportionally worse as n increases. So, on those findings, I don't think changing the generator is a good idea. So, I tried to dig a bit deeper into what was causing the issue. As part of this, I wrote some code to generate values directly from the distributions in various ways and the results all seemed as expected (ie reasonably aligned with the distribution type). After a bit more digging, and to cut a long story short, I found that the actual issue is related to the -pop setting. I'm still a bit hazy on this but it seems -pop is the distribution of all possible keys. So, if I have a -pop of dist(1..10) I can only have 10 possible key values (ie combinations across all columns) no matter what the ranges specified for the key column in the YAML file are. The default for -pop is UNIFORM(1..n) where n is specified or 1..1,000,000 where no n is specified. I think this all results in somewhat counter-intuitive results, particularly with multi-part keys. So, I think the actual answer here is to change the rules for the default -pop for yaml runs to have a population size equal to the product of the population size of each key as specified in the YAML. 
For example, if I have two columns: partition_key UNIFORM(1..1M) cluster_key UNIFORM(1..100) then the default population should be 1..100M. I think this is already implied by the YAML and what people would expect (certainly what I expected). I don't think this change will be too hard to make but interested to hear if anyone has any opinions before I jump into it. > Randomness of stress distributions is not good > -- > > Key: CASSANDRA-12744 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12744 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: Ben Slater >Priority: Minor > Labels: stress > Fix For: 4.0 > > > The randomness of our distributions is pretty bad. We are using the > JDKRandomGenerator() but in testing of uniform(1..3) we see for 100 > iterations it's only outputting 3. If you bump it to 10k it hits all 3 > values. > I made a change to just use the default commons math random generator and now > see all 3 values for n=10 -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021038#comment-16021038 ] Ben Slater commented on CASSANDRA-12744: [~tjake] - just realised this one was still open. If you can kick off the tests again, I'd be happy to dig into any issues. > Randomness of stress distributions is not good > -- > > Key: CASSANDRA-12744 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12744 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: T Jake Luciani >Priority: Minor > Labels: stress > Fix For: 3.0.x > > > The randomness of our distributions is pretty bad. We are using the > JDKRandomGenerator() but in testing of uniform(1..3) we see for 100 > iterations it's only outputting 3. If you bump it to 10k it hits all 3 > values. > I made a change to just use the default commons math random generator and now > see all 3 values for n=10 -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
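The symptom in the issue description (uniform(1..3) yielding only the value 3 over 100 iterations) can be expressed as a simple coverage check. The following is a minimal sketch in Python, assuming Python's `random.Random` as a stand-in generator; it is neither JDKRandomGenerator nor the commons-math default, and `sample_uniform` and the seed value are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def sample_uniform(low, high, n, seed=12744):
    """Draw n integers from the inclusive range low..high and count each value."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return Counter(rng.randint(low, high) for _ in range(n))


counts = sample_uniform(1, 3, 100)
missing = {1, 2, 3} - set(counts)

# For a healthy uniform(1..3) source, 100 draws should cover all three values.
print("values seen:", sorted(counts))
print("values never drawn:", sorted(missing))
```

A generator behaving as described in the report would leave 1 and 2 in the "never drawn" set, so the same check could be pointed at any distribution under suspicion.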
[jira] [Commented] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001830#comment-16001830 ] Ben Slater commented on CASSANDRA-8780: --- Excellent! Thanks [~tjake] for the assistance. > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 4.0 > > Attachments: 8780-trunk-v3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16000128#comment-16000128 ] Ben Slater commented on CASSANDRA-8780: --- [~tjake] just thought I'd give you a nudge on this. Thanks! > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > Attachments: 8780-trunk-v3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990062#comment-15990062 ] Ben Slater edited comment on CASSANDRA-8780 at 5/4/17 2:10 AM: --- Turned out to be a one-liner from when I rebased. On the plus side, I know how to run dtests now. Updated patch attached. was (Author: slater_ben): Turned out to be a one-liner from when I rebased. On the plus side, I know how to run dtests now. Updated patch attached. > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > Attachments: 8780-trunk-v3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-8780: -- Attachment: 8780-trunk-v3.patch Turned out to be a one-liner from when I rebased. On the plus side, I know how to run dtests now. Updated patch attached. > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > Attachments: 8780-trunk-v3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-8780: -- Status: Patch Available (was: Awaiting Feedback) > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > Attachments: 8780-trunk-v3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-8780: -- Attachment: (was: 8780-trunkv2.txt) > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989473#comment-15989473 ] Ben Slater commented on CASSANDRA-8780: --- Yep, it should be backwards compatible. I'll take a look. Thanks for your help. > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > Attachments: 8780-trunkv2.txt > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979770#comment-15979770 ] Ben Slater commented on CASSANDRA-8780: --- [~tjake] Wondering if you are likely to have time to review this anytime soon or should we throw it back in the pool and see if anyone else is available to review? > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > Attachments: 8780-trunkv2.txt > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969853#comment-15969853 ] Ben Slater edited comment on CASSANDRA-8780 at 4/15/17 7:54 AM: Uploading a new patch rebased to current trunk. Moved to specifying a specname in the yaml and using that to identify ops. Realised this has the advantage of allowing you to have multiple yamls for a single table which could be useful in some circumstances. Disadvantage is it would make it more difficult to validate results (if someone tries to implement this in the future). Also put in a decent error message if an op references a missing spec. was (Author: slater_ben): Uploading a new patch. Moved to specifying a specname in the yaml and using that to identify ops. Realised this has the advantage of allowing you to have multiple yamls for a single table which could be useful in some circumstances. Disadvantage is it would make it more difficult to validate results (if someone tries to implement this in the future). > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > Attachments: 8780-trunkv2.txt > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-8780: -- Attachment: 8780-trunkv2.txt > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > Attachments: 8780-trunkv2.txt > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-8780: -- Status: Patch Available (was: Awaiting Feedback) Uploading a new patch. Moved to specifying a specname in the yaml and using that to identify ops. Realised this has the advantage of allowing you to have multiple yamls for a single table which could be useful in some circumstances. Disadvantage is it would make it more difficult to validate results (if someone tries to implement this in the future). > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-8780: -- Attachment: (was: 8780-trunk.patch) > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-8780: -- Status: Awaiting Feedback (was: In Progress) > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > Attachments: 8780-trunk.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-12906) Update doco with new getting started with contribution section
[ https://issues.apache.org/jira/browse/CASSANDRA-12906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886718#comment-15886718 ] Ben Slater commented on CASSANDRA-12906: [~zznate] - just remembered this one and thought I'd give you a ping in case you have time to look > Update doco with new getting started with contribution section > -- > > Key: CASSANDRA-12906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12906 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Attachments: CASSANDRA-12906v2-trunk.patch > > > Following discussion on the mailing list about how to get more community > input it seemed to be agreed that adding some doco emphasising contributions > other than creating new features would be a good idea. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-12694) PAXOS Update Corrupted empty row exception
[ https://issues.apache.org/jira/browse/CASSANDRA-12694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838987#comment-15838987 ] Ben Slater commented on CASSANDRA-12694: I am on leave until Monday 30 Jan. If you need an immediate response please contact [1]sa...@instaclustr.com or [2]supp...@instaclustr.com as appropriate. For less urgent queries, I will be checking email every couple of days and respond or redirect. Cheers Ben Slater Instaclustr -- Ben Slater, Chief Product Officer, [3]Instaclustr: Cassandra + Spark - Managed | Consulting | Support, [4]www.instaclustr.com [1] mailto:sa...@instaclustr.com [2] mailto:supp...@instaclustr.com [3] https://www.instaclustr.com [4] http://www.instaclustr.com > PAXOS Update Corrupted empty row exception > -- > > Key: CASSANDRA-12694 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12694 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths > Environment: 3 node cluster using RF=3 running on cassandra 3.7 >Reporter: Cameron Zemek >Assignee: Alex Petrov > Fix For: 3.0.11, 3.10 > > > {noformat} > cqlsh> create table test.test (test_id TEXT, last_updated TIMESTAMP, > message_id TEXT, PRIMARY KEY(test_id)); > update test.test set last_updated = 1474494363669 where test_id = 'test1' if > message_id = null; > {noformat} > Then nodetool flush on all 3 nodes. 
> {noformat} > cqlsh> update test.test set last_updated = 1474494363669 where test_id = > 'test1' if message_id = null; > ServerError: > {noformat} > From cassandra log > {noformat} > ERROR [SharedPool-Worker-1] 2016-09-23 12:09:13,179 Message.java:611 - > Unexpected exception during request; channel = [id: 0x7a22599e, > L:/127.0.0.1:9042 - R:/127.0.0.1:58297] > java.io.IOError: java.io.IOException: Corrupt empty row found in unfiltered > partition > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:224) > ~[main/:na] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:212) > ~[main/:na] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[main/:na] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators.digest(UnfilteredRowIterators.java:125) > ~[main/:na] > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators.digest(UnfilteredPartitionIterators.java:249) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse.makeDigest(ReadResponse.java:87) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$DataResponse.digest(ReadResponse.java:192) > ~[main/:na] > at > org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:80) > ~[main/:na] > at > org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:139) > ~[main/:na] > at > org.apache.cassandra.service.AbstractReadExecutor.get(AbstractReadExecutor.java:145) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch(StorageProxy.java:1714) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1663) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1604) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1523) > ~[main/:na] > 
at > org.apache.cassandra.service.StorageProxy.readOne(StorageProxy.java:1497) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy.readOne(StorageProxy.java:1491) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy.cas(StorageProxy.java:249) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.ModificationStatement.executeWithCondition(ModificationStatement.java:441) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:416) > ~[main/:na] > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:208) > ~[main/:na] > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:239) > ~[main/:na] > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:224) > ~[main/:na] > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115) > ~[main/:na] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507) > [main/:na] > at >
[jira] [Commented] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826181#comment-15826181 ] Ben Slater commented on CASSANDRA-12744: I am on leave until Monday 30 Jan. If you need an immediate response please contact [1]sa...@instaclustr.com or [2]supp...@instaclustr.com as appropriate. For less urgent queries, I will be checking email every couple of days and respond or redirect. Cheers Ben Slater Instaclustr -- Ben Slater, Chief Product Officer, [3]Instaclustr: Cassandra + Spark - Managed | Consulting | Support, [4]www.instaclustr.com [1] mailto:sa...@instaclustr.com [2] mailto:supp...@instaclustr.com [3] https://www.instaclustr.com [4] http://www.instaclustr.com > Randomness of stress distributions is not good > -- > > Key: CASSANDRA-12744 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12744 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: T Jake Luciani >Priority: Minor > Labels: stress > Fix For: 3.0.x > > > The randomness of our distributions is pretty bad. We are using the > JDKRandomGenerator() but in testing of uniform(1..3) we see for 100 > iterations it's only outputting 3. If you bump it to 10k it hits all 3 > values. > I made a change to just use the default commons math random generator and now > see all 3 values for n=10 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15771303#comment-15771303 ] Ben Slater commented on CASSANDRA-8780: --- Hmm, still not sure we're on the same page here. The KS and Table from the yaml are still used the same as in the existing implementation. They are only added to the command line to link up the ops specified there with the relevant yaml spec. If only one table is being stressed (one yaml spec file) then you can actually leave out the ks and table on the command line and the single yaml file will automatically be used (to maintain backwards compatibility and simplicity in the single table case). Where multiple yamls are specified, we can't assume that op names will be unique across yamls (insert, for example) so there needs to be some qualification of the ops specified on the command line to link them to the yaml file. I chose to use ks.table as this qualifier as it seemed to me like a pretty natural link. The alternatives I can see would be to (a) introduce a new, arbitrary spec name in the yaml file and use that to qualify the op names on the command line, (b) qualify the ops with some part of the yaml file name or (c) require op names to be globally unique across multiple yaml files. I still think ks.table is the best of the alternatives for the qualifier but I'm happy to be persuaded otherwise. I did notice on looking through the code again that I did a poor to non-existent job of handling the case where someone specifies a ks.table that's not in a yaml file, so I will update the code (and rebase) for that once we get this point agreed. 
> cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.x > > Attachments: 8780-trunk.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
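The qualification rule described in the comment above can be sketched as follows. This is only an illustration and not the code in 8780-trunk.patch: the class `OpQualifierSketch`, the method `resolve`, and the `ks.table.op` command-line spelling are all assumptions made for the example, since the comment does not show the actual syntax. The map stands in for the set of loaded yaml profiles, keyed by `ks.table`.

```java
import java.util.Map;

// Hypothetical sketch: links an op named on the command line to a yaml profile.
// None of these names come from the actual cassandra-stress patch.
final class OpQualifierSketch {
    /**
     * @param spec     either "op" or "ks.table.op" (assumed spelling)
     * @param profiles map from "ks.table" to the yaml file that defines it
     * @return a two-element array: the "ks.table" key and the bare op name
     */
    static String[] resolve(String spec, Map<String, String> profiles) {
        int lastDot = spec.lastIndexOf('.');
        if (lastDot < 0) {
            // Unqualified op: only unambiguous when exactly one yaml file is loaded.
            if (profiles.size() != 1) {
                throw new IllegalArgumentException(
                    "op '" + spec + "' must be qualified with ks.table when several yaml files are given");
            }
            return new String[] { profiles.keySet().iterator().next(), spec };
        }
        String table = spec.substring(0, lastDot);
        String op = spec.substring(lastDot + 1);
        // Reject a ks.table that no yaml file declares.
        if (!profiles.containsKey(table)) {
            throw new IllegalArgumentException("no yaml profile found for " + table);
        }
        return new String[] { table, op };
    }
}
```

With a single profile loaded, `resolve("insert", profiles)` falls back to that profile, which mirrors the backwards-compatible single-table case; with two profiles, `insert` on its own is rejected as ambiguous.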
[jira] [Commented] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15768271#comment-15768271 ] Ben Slater commented on CASSANDRA-8780: --- Thanks for reviewing [~tjake]. I'm using the keyspace + table as basically the name for the profile (with the check that there is only one profile per ks.table). So, the regex is parsing the command line in order to match what is specified there with a profile (which is really the source of truth for ks.table names). I think the alternative would be to add an explicit name field into the profile although I'm not sure that would make things a whole lot better (other than providing a path should we want to have multiple profiles per ks.table in the future which, now I think of it, could be potentially useful if a bit esoteric). Any guidance on what you'd like to see? > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.x > > Attachments: 8780-trunk.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12906) Update doco with new getting started with contribution section
[ https://issues.apache.org/jira/browse/CASSANDRA-12906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15739235#comment-15739235 ] Ben Slater commented on CASSANDRA-12906: [~zznate] just thought I'd give you a reminder on this. Hopefully I've got the patch format right now! > Update doco with new getting started with contribution section > -- > > Key: CASSANDRA-12906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12906 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Attachments: CASSANDRA-12906v2-trunk.patch > > > Following discussion on the mailing list about how to get more community > input it seemed to be agreed that adding some doco emphasising contributions > other than creating new features would be a good idea. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12906) Update doco with new getting started with contribution section
[ https://issues.apache.org/jira/browse/CASSANDRA-12906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12906: --- Attachment: CASSANDRA-12906v2-trunk.patch Hopefully I've got the patch right this time. Having a shocker! > Update doco with new getting started with contribution section > -- > > Key: CASSANDRA-12906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12906 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Attachments: CASSANDRA-12906v2-trunk.patch > > > Following discussion on the mailing list about how to get more community > input it seemed to be agreed that adding some doco emphasising contributions > other than creating new features would be a good idea. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12906) Update doco with new getting started with contribution section
[ https://issues.apache.org/jira/browse/CASSANDRA-12906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12906: --- Attachment: (was: 12906-trunk.patch) > Update doco with new getting started with contribution section > -- > > Key: CASSANDRA-12906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12906 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > > Following discussion on the mailing list about how to get more community > input it seemed to be agreed that adding some doco emphasising contributions > other than creating new features would be a good idea. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12906) Update doco with new getting started with contribution section
[ https://issues.apache.org/jira/browse/CASSANDRA-12906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12906: --- Attachment: 12906-trunk.patch updated git CLI'd patch attached. > Update doco with new getting started with contribution section > -- > > Key: CASSANDRA-12906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12906 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Attachments: 12906-trunk.patch > > > Following discussion on the mailing list about how to get more community > input it seemed to be agreed that adding some doco emphasising contributions > other than creating new features would be a good idea. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12906) Update doco with new getting started with contribution section
[ https://issues.apache.org/jira/browse/CASSANDRA-12906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12906: --- Attachment: (was: CASSANDRA_12906-trunk.patch) > Update doco with new getting started with contribution section > -- > > Key: CASSANDRA-12906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12906 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > > Following discussion on the mailing list about how to get more community > input it seemed to be agreed that adding some doco emphasising contributions > other than creating new features would be a good idea. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12906) Update doco with new getting started with contribution section
[ https://issues.apache.org/jira/browse/CASSANDRA-12906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12906: --- Status: Patch Available (was: Open) Draft updates in the attached patch. Substantial changes are only in the new file (gettingstarted.rst) other changes are just to update the index and add labels (anchors) in the existing files for linking. > Update doco with new getting started with contribution section > -- > > Key: CASSANDRA-12906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12906 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Attachments: CASSANDRA_12906-trunk.patch > > > Following discussion on the mailing list about how to get more community > input it seemed to be agreed that adding some doco emphasising contributions > other than creating new features would be a good idea. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12906) Update doco with new getting started with contribution section
[ https://issues.apache.org/jira/browse/CASSANDRA-12906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12906: --- Attachment: CASSANDRA_12906-trunk.patch > Update doco with new getting started with contribution section > -- > > Key: CASSANDRA-12906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12906 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Attachments: CASSANDRA_12906-trunk.patch > > > Following discussion on the mailing list about how to get more community > input it seemed to be agreed that adding some doco emphasising contributions > other than creating new features would be a good idea. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12906) Update doco with new getting started with contribution section
Ben Slater created CASSANDRA-12906: -- Summary: Update doco with new getting started with contribution section Key: CASSANDRA-12906 URL: https://issues.apache.org/jira/browse/CASSANDRA-12906 Project: Cassandra Issue Type: Improvement Components: Documentation and Website Reporter: Ben Slater Assignee: Ben Slater Priority: Minor Following discussion on the mailing list about how to get more community input it seemed to be agreed that adding some doco emphasising contributions other than creating new features would be a good idea. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12490: --- Attachment: 12490updatev2-trunk.patch Updated patch (to current trunk including previous patch) attached. (Squeezed it in to Friday afternoon.) > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, > 12490updatev2-trunk.patch, cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12490: --- Attachment: (was: 12490update-trunk.patch) > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593928#comment-15593928 ] Ben Slater commented on CASSANDRA-12490: Yep, will take a look over the (Australian) weekend. > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, 12490update-trunk.patch, > cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581709#comment-15581709 ] Ben Slater commented on CASSANDRA-12490: ok, agreed. > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, 12490update-trunk.patch, > cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581194#comment-15581194 ] Ben Slater commented on CASSANDRA-12490: Thanks for the feedback. You're right - next.set(seed) would be a better implementation - I didn't check if there was another method of setting the value of an AtomicLong. I can update the patch if the decision is to go ahead. It still generates min..max as nextWithWrap() (which is called by the other variants of next()) returns start + (next % totalCount) so it starts again from min if it goes past max. I think the need for this is less with Jake's fix to the random generator in CASSANDRA-12744. However, I still think it serves a purpose for loading background data for a test without overlap. However, if I'm over-ruled on that so be it. > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, 12490update-trunk.patch, > cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
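The wrap-around behaviour described in the comment above can be sketched as below. This is a minimal illustration and not the code from the attached patches: the class name `SequenceSketch` and its constructor are assumptions, while `setSeed`, `nextWithWrap`, `start`, `totalCount` and the `AtomicLong` counter follow the names used in the comment. One deliberate departure is `Math.floorMod` in place of `%`, so that a negative seed still maps into the range.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a wrapping sequence over start..end (inclusive).
final class SequenceSketch {
    private final long start;
    private final long totalCount;
    private final AtomicLong next = new AtomicLong();

    SequenceSketch(long start, long end) {
        if (end < start) {
            throw new IllegalArgumentException("end must be >= start");
        }
        this.start = start;
        this.totalCount = end - start + 1;
    }

    // Position the counter at the supplied seed instead of resetting it to zero.
    void setSeed(long seed) {
        next.set(seed);
    }

    // Returns start + (counter mod totalCount), then advances the counter,
    // so the sequence restarts from start once it passes end.
    long nextWithWrap() {
        return start + Math.floorMod(next.getAndIncrement(), totalCount);
    }
}
```

For a range of 1..3, successive calls yield 1, 2, 3 and then wrap back to 1; calling `setSeed(5)` first makes the next value 3, since 5 mod 3 is 2.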
[jira] [Commented] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15580762#comment-15580762 ] Ben Slater commented on CASSANDRA-12744: I tried this patch out. It definitely seems to improve the distribution to something like what you'd expect. Didn't notice any issues. > Randomness of stress distributions is not good > -- > > Key: CASSANDRA-12744 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12744 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: T Jake Luciani >Priority: Minor > Labels: stress > Fix For: 3.0.10 > > > The randomness of our distributions is pretty bad. We are using the > JDKRandomGenerator() but in testing of uniform(1..3) we see for 100 > iterations it's only outputting 3. If you bump it to 10k it hits all 3 > values. > I made a change to just use the default commons math random generator and now > see all 3 values for n=10 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12490: --- Status: Patch Available (was: Reopened) Patch attached. > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, 12490update-trunk.patch, > cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12490: --- Attachment: 12490update-trunk.patch I've attached the patch to fix the implementation of setSeed() (update to current trunk rather than complete new patch). [~aboudreault] the same change also fixes the issue you mentioned where things didn't work properly with multiple threads. > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, 12490update-trunk.patch, > cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573973#comment-15573973 ] Ben Slater commented on CASSANDRA-12490: I realised [~tjake] was saying that a validation error is the expected behaviour and occurs in 3.9 but not trunk. I just tried but can't get a validation error in 3.9 with a YAML file (as I said, I wasn't aware that validation functionality existed for YAML specs). Can you provide some more details on how to reproduce? > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573966#comment-15573966 ] Ben Slater commented on CASSANDRA-12490: Moving back from the dev list, where I think the discussion ended up by accident. Jake said: No I'm not using a seq anywhere else than the command line. I said: OK, I think it’s pretty unlikely to be this change as I didn’t change the existing code (certainly nothing near what is used by -pop) and also I just noticed you said you had the issue in 3.9 and CASS-12490 is destined for 3.10. Also, last time I looked, I thought stress didn’t validate returned results for YAML specs. Did I miss something or did that get added recently? Can you add your actual command, etc to the ticket? Anyway, I will try to do some more digging over the weekend as I still suspect there is something wrong (or at least unexpected) going on aside from this change. > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573119#comment-15573119 ] Ben Slater commented on CASSANDRA-12490: Just to check [~tjake] when you say "this also breaks validation", I assume you mean it breaks validation when you use the sequence distribution type, not in the case where you don't use seq()? > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573116#comment-15573116 ] Ben Slater commented on CASSANDRA-12490: Yeah, that would be my misunderstanding and misimplementation of setSeed(). The fix appears to be trivial (discussed somewhere in the wall of text above). I'll test a bit more and submit a patch in the next day or two. > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15571589#comment-15571589 ] Ben Slater edited comment on CASSANDRA-12490 at 10/13/16 10:57 AM: --- OK, so I did some more investigation this evening to try to better understand this and found a few interesting things. I suspect there is at least one bug here but I'll be interested to see what you think. I set up a simple spec to test what was going on: {code} table: test4 table_definition: | CREATE TABLE test4 ( pk text, val text, PRIMARY KEY (pk) ) columnspec: - name: pk size: fixed(32) population: uniform(1..50) {code} When I run this with `ops(insert=1) n=50` the end result is 1 row added to the table. When I run it with n=500 I get 3 rows. Some other observations: a) tracing this through it seems that's because the small number of seed values from the population (due to the small n=) results in a very low variation in values being returned from `delegate.sample()` in `DistributionBoundApache.next()`. They were (all 37Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15571589#comment-15571589 ] Ben Slater commented on CASSANDRA-12490: OK, so I did some more investigation this evening to try to better understand this and found a few interesting things. I suspect there is at least one bug here but I'll be interested to see what you think. I set up a simple spec to test what was going on: ``` table: test4 table_definition: | CREATE TABLE test4 ( pk text, val text, PRIMARY KEY (pk) ) columnspec: - name: pk size: fixed(32) population: uniform(1..50)``` When I run this with `ops(insert=1) n=50` the end result is 1 row added to the table. When I run it with n=500 I get 3 rows. Some other observations: a) tracing this through it seems that's because the small number of seed values from the population (due to the small n=) results in a very low variation in values being returned from `delegate.sample()` in `DistributionBoundApache.next()`. They were (all 37Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564148#comment-15564148 ] Ben Slater commented on CASSANDRA-12490: Yes, you're right: resetting the counter to zero on setSeed() does result in the same row being generated over and over again (which does make me wonder how stress is respecting the distribution for the PK value, but I didn't investigate at this point). However, that is pretty easily fixed by having setSeed() set the counter to the supplied seed value. I think once we do this SEQ behaves very similarly to the other distributions. I don't think it's correct that stress generates every value if the number of unique values it can generate is <= the number of values it is being asked to generate for a partition. This would only respect the distribution in the case of uniform distribution, however even then I don't think it's guaranteed to be completely uniform (and thus generate all values) from n samples of a 1..n distribution (you probably need to do many * n to get very close to uniform) - it certainly doesn't seem to behave this way in testing. For say normal distribution you'd need several * n to cover all the possible values and have close to a normal distribution. I'm afraid I don't really understand why you think this is abusing the notion of distributions when (a) there was already a sequence distribution type in the "legacy" distribution sets (presumably for just this purpose) and (b) to me, one way of describing this is a uniform distribution with minimal chance of collisions (ie it's just another way of selecting values from a range). Finally, it's not quite correct to say I'm trying to populate all possible values for a column, rather trying to generate as many unique values as possible (within the specified ranges) for a given sample size (to minimise overwriting). 
> Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
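The point made in the comment above, that n samples from a uniform 1..n distribution will not normally produce all n values, can be checked with a little arithmetic. The following is a minimal sketch, assuming independent uniform draws; it is not cassandra-stress code, and the function names `expected_distinct` and `simulate_distinct` are made up for illustration.

```python
import random


def expected_distinct(n: int, k: int) -> float:
    """Expected number of distinct values after k independent uniform draws from 1..n.

    Each value is missed by a single draw with probability (1 - 1/n), so it is
    missed by all k draws with probability (1 - 1/n) ** k.
    """
    return n * (1.0 - (1.0 - 1.0 / n) ** k)


def simulate_distinct(n: int, k: int, seed: int = 0) -> int:
    """Count the distinct values seen in one simulated run of k uniform draws from 1..n."""
    rng = random.Random(seed)
    return len({rng.randint(1, n) for _ in range(k)})


if __name__ == "__main__":
    for k in (50, 100, 500):
        print(k, round(expected_distinct(50, k), 2), simulate_distinct(50, k))
```

Under this simple model, 50 draws from 1..50 are expected to yield roughly 32 distinct values, and it takes several multiples of n before nearly all values appear, which is consistent with the "many * n" remark above. A sequence distribution avoids these collisions by construction until the sequence wraps.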
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561852#comment-15561852 ] Ben Slater commented on CASSANDRA-12490: Hi Benedict, I must be missing something here because as far as I can tell from testing a few different scenarios, setting -pop seq=1..N doesn't have any impact on the set of data generated when used with a YAML file. That aside, the intent is that you use the SEQ distribution for doing an initial load of background data before running say a read test or a mixed read/write test so that you are running with a representative volume of data on disk (and that you probably wouldn't use SEQ for these later tests). In that case you wouldn't expect/care whether the set of data generated initially lines up in the same order as what is generated by later runs (although you would expect them to be from the same overall populations of values which I believe does hold). I believe the sequence of data generation would have to change similarly if you changed between existing distribution types between runs? Looking again at the code, I can see how the current implementation of SEQ is an issue for implementing future data validation as it doesn't "reset" as you visit each partition. I think the other distributions effectively reset due to the call to setSeed(). However, I think this can fairly easily be rectified by having the setSeed() implementation of DistributionSequence reset the next value to 0? Cheers Ben > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. 
> This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
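The reset behaviour proposed in the comment above can be pictured with a small model. This is only a sketch under stated assumptions: the class `SequenceSketch` and its methods `next` and `set_seed` are hypothetical names, written in Python for brevity, and are not the code in the attached patch.

```python
class SequenceSketch:
    """Hypothetical model of a sequence 'distribution' over the inclusive range lo..hi."""

    def __init__(self, lo: int, hi: int) -> None:
        if hi < lo:
            raise ValueError("hi must be >= lo")
        self.lo = lo
        self.hi = hi
        self._count = 0  # number of values handed out since the last reset

    def next(self) -> int:
        """Return the next value in sequence, wrapping back to lo after hi."""
        value = self.lo + self._count % (self.hi - self.lo + 1)
        self._count += 1
        return value

    def set_seed(self, seed: int) -> None:
        """Model of the proposal above: ignore the seed and restart the sequence."""
        self._count = 0


if __name__ == "__main__":
    seq = SequenceSketch(1, 3)
    print([seq.next() for _ in range(5)])  # values wrap once the range is exhausted
    seq.set_seed(42)
    print(seq.next())  # the sequence restarts at lo after a reset
```

One consequence is visible even in this toy model: if `set_seed` is called before every value is drawn, `next` keeps returning `lo`, so whether a reset-to-start is appropriate depends on how often the caller reseeds.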
[jira] [Commented] (CASSANDRA-12718) typo in cql examples
[ https://issues.apache.org/jira/browse/CASSANDRA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15528456#comment-15528456 ] Ben Slater commented on CASSANDRA-12718: Yeah, that's probably the way you'd typically do it although I guess having it as an int illustrates using a non-text data type. > typo in cql examples > - > > Key: CASSANDRA-12718 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12718 > Project: Cassandra > Issue Type: Bug > Components: Documentation and Website >Reporter: Jon Haddad >Assignee: Jon Haddad >Priority: Trivial > > select query on sets using the wrong field -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-12718) typo in cql examples
[ https://issues.apache.org/jira/browse/CASSANDRA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15528435#comment-15528435 ] Ben Slater edited comment on CASSANDRA-12718 at 9/28/16 5:14 AM: - The JSON in the insert also needs to be updated with a couple of typos (comma before work and remove quotes around zip). Probably easiest to do them as one ticket. Fixed version is: INSERT INTO user (name, addresses) VALUES ('z3 Pr3z1den7', { 'home' : { street: '1600 Pennsylvania Ave NW', city: 'Washington', zip: 20500, phones: { 'cell' : { country_code: 1, number: '202 456-' }, 'landline' : { country_code: 1, number: '...' } } }, 'work' : { street: '1600 Pennsylvania Ave NW', city: 'Washington', zip: 20500, phones: { 'fax' : { country_code: 1, number: '...' } } } }) was (Author: slater_ben): The JSON in the insert also needs to be updated with a couple of typos (comma before work and remove quotes around zip). Probably easiest to do them as one ticket. Fixed version is: ```INSERT INTO user (name, addresses) VALUES ('z3 Pr3z1den7', { 'home' : { street: '1600 Pennsylvania Ave NW', city: 'Washington', zip: 20500, phones: { 'cell' : { country_code: 1, number: '202 456-' }, 'landline' : { country_code: 1, number: '...' } } }, 'work' : { street: '1600 Pennsylvania Ave NW', city: 'Washington', zip: 20500, phones: { 'fax' : { country_code: 1, number: '...' } } } })``` > typo in cql examples > - > > Key: CASSANDRA-12718 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12718 > Project: Cassandra > Issue Type: Bug > Components: Documentation and Website >Reporter: Jon Haddad >Assignee: Jon Haddad >Priority: Trivial > > select query on sets using the wrong field -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12718) typo in cql examples
[ https://issues.apache.org/jira/browse/CASSANDRA-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15528435#comment-15528435 ] Ben Slater commented on CASSANDRA-12718: The JSON in the insert also needs to be updated with a couple of typos (comma before work and remove quotes around zip). Probably easiest to do them as one ticket. Fixed version is: ```INSERT INTO user (name, addresses) VALUES ('z3 Pr3z1den7', { 'home' : { street: '1600 Pennsylvania Ave NW', city: 'Washington', zip: 20500, phones: { 'cell' : { country_code: 1, number: '202 456-' }, 'landline' : { country_code: 1, number: '...' } } }, 'work' : { street: '1600 Pennsylvania Ave NW', city: 'Washington', zip: 20500, phones: { 'fax' : { country_code: 1, number: '...' } } } })``` > typo in cql examples > - > > Key: CASSANDRA-12718 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12718 > Project: Cassandra > Issue Type: Bug > Components: Documentation and Website >Reporter: Jon Haddad >Assignee: Jon Haddad >Priority: Trivial > > select query on sets using the wrong field -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15524405#comment-15524405 ] Ben Slater commented on CASSANDRA-12490: Actually, a colleague of mine just submitted a patch for that issue last week - it occurs regardless of distribution but is probably more obvious using SEQ. See https://issues.apache.org/jira/browse/CASSANDRA-11138 > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 12490-trunk.patch, 12490.yaml, cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > I think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492826#comment-15492826 ] Ben Slater commented on CASSANDRA-8780: --- Just thought I'd give this a bump to see if I could get a review before I forget what I did :-) > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.x > > Attachments: 8780-trunk.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15463920#comment-15463920 ] Ben Slater edited comment on CASSANDRA-8780 at 9/6/16 10:02 PM: I've attached a patch that implements this. Key points: - existing command lines, spec files, etc should work unchanged - to test multiple files: - put a comma separated list of files names for spec= (eg spec=file1.txt,file2.txt) - when specifying operations, qualify with keyspace and table name (eg ops(stressexample.eventsrawtest.insert=1,stressexample.eventsrawtest2.insert=1,stressexample.eventsrawtest2.get-a-value=1,stressexample.eventsrawtest2.pull-for-rollup=1) ) - I looked at doing unit tests but couldn't really see a way of testing without building framework for a whole lot of existing functionality I've tested a few different scenarios on files I had lying around and all seems to work. was (Author: slater_ben): I've attached a patch that implements this. Key points: - existing command lines, spec files, etc should work unchanged - to test multiple files: - put a comma separated list of files names for spec= (eg spec=file1.txt,spec=file2.txt) - when specifying operations, qualify with keyspace and table name (eg ops(stressexample.eventsrawtest.insert=1,stressexample.eventsrawtest2.insert=1,stressexample.eventsrawtest2.get-a-value=1,stressexample.eventsrawtest2.pull-for-rollup=1) ) - I looked at doing unit tests but couldn't really see a way of testing without building framework for a whole lot of existing functionality I've tested a few different scenarios on files I had lying around and all seems to work. 
> cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.x > > Attachments: 8780-trunk.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
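The qualified operation names described in the comment above follow a `keyspace.table.operation=weight` pattern. A minimal sketch of how such a list could be split is shown below; `parse_ops` is a hypothetical helper written for illustration and is not the parsing code in the attached patch.

```python
from typing import List, Optional, Tuple

Op = Tuple[Optional[str], Optional[str], str, float]


def parse_ops(spec: str) -> List[Op]:
    """Split 'keyspace.table.op=weight,...' into (keyspace, table, op, weight) tuples.

    Unqualified entries such as 'insert=1' are kept, with keyspace and table set
    to None, mirroring the statement that existing command lines keep working.
    """
    result: List[Op] = []
    for item in spec.split(","):
        name, _, weight = item.strip().partition("=")
        parts = name.split(".")
        if len(parts) == 3:
            keyspace, table, op = parts
        elif len(parts) == 1:
            keyspace, table, op = None, None, parts[0]
        else:
            raise ValueError(f"cannot interpret operation name: {name!r}")
        result.append((keyspace, table, op, float(weight)))
    return result


if __name__ == "__main__":
    ops = "stressexample.eventsrawtest.insert=1,stressexample.eventsrawtest2.get-a-value=1"
    for entry in parse_ops(ops):
        print(entry)
```

Qualifying each operation with its keyspace and table is what lets two spec files define operations with the same name, such as `insert`, without ambiguity.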
[jira] [Comment Edited] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15463920#comment-15463920 ] Ben Slater edited comment on CASSANDRA-8780 at 9/5/16 4:38 AM: --- I've attached a patch that implements this. Key points: - existing command lines, spec files, etc should work unchanged - to test multiple files: - put a comma separated list of files names for spec= (eg spec=file1.txt,spec=file2.txt) - when specifying operations, qualify with keyspace and table name (eg ops(stressexample.eventsrawtest.insert=1,stressexample.eventsrawtest2.insert=1,stressexample.eventsrawtest2.get-a-value=1,stressexample.eventsrawtest2.pull-for-rollup=1) ) - I looked at doing unit tests but couldn't really see a way of testing without building framework for a whole lot of existing functionality I've tested a few different scenarios on files I had lying around and all seems to work. was (Author: slater_ben): I've attached a patch that implements this. Key points: - existing command lines, spec files, etc should work unchanged - to test multiple files: - put a comma separated list of files names for spec= (eg spec=file1.txt,spec=file2.txt) - when specifying operations, qualify with keyspace and table name (eg ops(stressexample.eventsrawtest.insert=1,stressexample.eventsrawtest2.insert=1,stressexample.eventsrawtest2.get-a-value=1,stressexample.eventsrawtest2.pull-for-rollup=1) ) I've tested a few different scenarios on files I had lying around and all seems to work. > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.x > > Attachments: 8780-trunk.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-8780: -- Attachment: 8780-trunk.patch Actually attaching the patch this time. > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.x > > Attachments: 8780-trunk.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-8780: -- Assignee: Ben Slater Fix Version/s: 3.x Status: Patch Available (was: Open) I've attached a patch that implements this. Key points: - existing command lines, spec files, etc should work unchanged - to test multiple files: - put a comma separated list of files names for spec= (eg spec=file1.txt,spec=file2.txt) - when specifying operations, qualify with keyspace and table name (eg ops(stressexample.eventsrawtest.insert=1,stressexample.eventsrawtest2.insert=1,stressexample.eventsrawtest2.get-a-value=1,stressexample.eventsrawtest2.pull-for-rollup=1) ) I've tested a few different scenarios on files I had lying around and all seems to work. > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Ben Slater > Labels: stress > Fix For: 3.x > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12473) Errors in cassandra-stress print settings output
[ https://issues.apache.org/jira/browse/CASSANDRA-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15461884#comment-15461884 ] Ben Slater commented on CASSANDRA-12473: Just a reminder on this - it's still waiting for review/commit. Cheers Ben > Errors in cassandra-stress print settings output > > > Key: CASSANDRA-12473 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12473 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > Attachments: 11914-NPEFix-trunk.txt, 12473-trunk.txt > > > A few errors in stress settings output: > - mean and stdev transposed for gaussian distribution output > - no-settings setting mislabled "Print settings" > - typo "ration" instead of "ratio" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432105#comment-15432105 ] Ben Slater commented on CASSANDRA-12490: Nits look good to me. I think the example is sufficient - the ones I've used don't really illustrate anything additional. Thanks! > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > Attachments: 12490-trunk.patch, cqlstress-seq-example.yaml > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12490: --- Attachment: (was: 12490-trunk.patch) > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > Attachments: 12490-trunk.patch > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12490: --- Attachment: 12490-trunk.patch Updated patch attached. > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > Attachments: 12490-trunk.patch, 12490-trunk.patch > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430211#comment-15430211 ] Ben Slater commented on CASSANDRA-12490: Makes sense - I'll fix up inverseCumProb() and add a test. > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > Attachments: 12490-trunk.patch > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430002#comment-15430002 ] Ben Slater commented on CASSANDRA-12490: I'm pretty sure my implementation of inverseCumProb() is incorrect but it doesn't appear this practically matters. Happy to update if someone can explain what it's supposed to be returning. > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > Attachments: 12490-trunk.patch > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12473) Errors in cassandra-stress print settings output
[ https://issues.apache.org/jira/browse/CASSANDRA-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12473: --- Attachment: 11914-NPEFix-trunk.txt Also attaching a fix for another NPE issue reported under CASSANDRA-11914 so it doesn't get lost. > Errors in cassandra-stress print settings output > > > Key: CASSANDRA-12473 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12473 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > Attachments: 11914-NPEFix-trunk.txt, 12473-trunk.txt > > > A few errors in stress settings output: > - mean and stdev transposed for gaussian distribution output > - no-settings setting mislabled "Print settings" > - typo "ration" instead of "ratio" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-11914: --- Attachment: 11914-NPEFix-trunk.txt Whoops - didn't realise you could skip those sections of the profile. Patch attached. I will also attach this to CASSANDRA-12473 which was opened to fix a couple of other minor issues. [~tjake] or [~jkni] let me know if the patch would be better in a different format. > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > Attachments: 11914-NPEFix-trunk.txt > > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12490: --- Fix Version/s: 3.x Status: Patch Available (was: Open) > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > Attachments: 12490-trunk.patch > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
[ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12490: --- Attachment: 12490-trunk.patch Patch attached > Add sequence distribution type to cassandra stress > -- > > Key: CASSANDRA-12490 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Attachments: 12490-trunk.patch > > > When using the write command, cassandra stress sequentially generates seeds. > This ensures generated values don't overlap (unless the sequence wraps) > providing more predictable number of inserted records (and generating a base > set of data without wasted writes). > When using a yaml stress spec there is no sequenced distribution available. > It think it would be useful to have this for doing initial load of data for > testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12490) Add sequence distribution type to cassandra stress
Ben Slater created CASSANDRA-12490: -- Summary: Add sequence distribution type to cassandra stress Key: CASSANDRA-12490 URL: https://issues.apache.org/jira/browse/CASSANDRA-12490 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Ben Slater Assignee: Ben Slater Priority: Minor When using the write command, cassandra stress sequentially generates seeds. This ensures generated values don't overlap (unless the sequence wraps), providing a more predictable number of inserted records (and generating a base set of data without wasted writes). When using a yaml stress spec there is no sequenced distribution available. I think it would be useful to have this for doing an initial load of data for testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-12473) Errors in cassandra-stress print settings output
[ https://issues.apache.org/jira/browse/CASSANDRA-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15423810#comment-15423810 ] Ben Slater edited comment on CASSANDRA-12473 at 8/17/16 3:35 AM: - Patch attached. Thanks Joel for the review. was (Author: slater_ben): Patch attached > Errors in cassandra-stress print settings output > > > Key: CASSANDRA-12473 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12473 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > Attachments: 12473-trunk.txt > > > A few errors in stress settings output: > - mean and stdev transposed for gaussian distribution output > - no-settings setting mislabled "Print settings" > - typo "ration" instead of "ratio" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12473) Errors in cassandra-stress print settings output
[ https://issues.apache.org/jira/browse/CASSANDRA-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12473: --- Fix Version/s: 3.x Status: Patch Available (was: Open) > Errors in cassandra-stress print settings output > > > Key: CASSANDRA-12473 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12473 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > Attachments: 12473-trunk.txt > > > A few errors in stress settings output: > - mean and stdev transposed for gaussian distribution output > - no-settings setting mislabled "Print settings" > - typo "ration" instead of "ratio" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12473) Errors in cassandra-stress print settings output
[ https://issues.apache.org/jira/browse/CASSANDRA-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15423810#comment-15423810 ] Ben Slater commented on CASSANDRA-12473: Patch attached > Errors in cassandra-stress print settings output > > > Key: CASSANDRA-12473 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12473 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > Attachments: 12473-trunk.txt > > > A few errors in stress settings output: > - mean and stdev transposed for gaussian distribution output > - no-settings setting mislabled "Print settings" > - typo "ration" instead of "ratio" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12473) Errors in cassandra-stress print settings output
[ https://issues.apache.org/jira/browse/CASSANDRA-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-12473: --- Attachment: 12473-trunk.txt > Errors in cassandra-stress print settings output > > > Key: CASSANDRA-12473 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12473 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > Attachments: 12473-trunk.txt > > > A few errors in stress settings output: > - mean and stdev transposed for gaussian distribution output > - no-settings setting mislabled "Print settings" > - typo "ration" instead of "ratio" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12473) Errors in cassandra-stress print settings output
Ben Slater created CASSANDRA-12473: -- Summary: Errors in cassandra-stress print settings output Key: CASSANDRA-12473 URL: https://issues.apache.org/jira/browse/CASSANDRA-12473 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Ben Slater Assignee: Ben Slater Priority: Minor A few errors in stress settings output: - mean and stdev transposed for gaussian distribution output - no-settings setting mislabeled "Print settings" - typo "ration" instead of "ratio" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422600#comment-15422600 ] Ben Slater commented on CASSANDRA-11914: Hi, I just noticed a couple of small issues with this: one typo and, more significantly, mean and stdev transposed for the Gaussian distribution. Should I open a new Jira or provide an update on this one? Cheers Ben > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.10 > > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408854#comment-15408854 ] Ben Slater commented on CASSANDRA-11914: Have updated the branch with the following: - fixed the couple of misplaced brackets - made defaults for java connection settings explicit so they print - ensured that schema is created before creating distribution to print settings (modified maybeCreateSchema so a run will only try once to create the schema) - removed a couple of superfluous private options variables I had left behind - made print settings the default, with a -log no-settings option to turn it off > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
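Editorial sketch: the comment above says maybeCreateSchema was changed so that a run tries to create the schema only once, which lets both the settings printout and the run itself call it safely. The Python below is a minimal illustration of that once-only guard under assumed names. `StressRun`, `maybe_create_schema`, and `print_settings` are hypothetical stand-ins and do not reproduce the actual Java patch.

```python
class StressRun:
    """Hypothetical stand-in for a stress run; not the cassandra-stress API."""

    def __init__(self, create_schema):
        # create_schema is any callable that performs the real creation work.
        self._create_schema = create_schema
        self._schema_attempted = False

    def maybe_create_schema(self):
        # Only the first call attempts creation. Later calls are no-ops,
        # even if that first attempt raised an exception.
        if self._schema_attempted:
            return
        self._schema_attempted = True
        self._create_schema()

    def print_settings(self):
        # Settings derived from the schema need the schema to exist first.
        self.maybe_create_schema()
        return "settings printed"

    def run(self):
        # The run also asks for the schema, but the guard prevents a retry.
        self.maybe_create_schema()
        return "run complete"
```

With this shape, calling `print_settings()` and then `run()` triggers a single creation attempt regardless of call order.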
[jira] [Commented] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403249#comment-15403249 ] Ben Slater commented on CASSANDRA-11914: Turns out the "create schema" problem is not so much that it doesn't create the schema but rather that trying to do the calcs to display the "Generating batches with x partitions and x rows ..." message results in a call to maybeLoadSchemaInfo(), which fails if the schema doesn't exist (creating the schema still works if you don't do print-settings). So, I think the alternatives to fix are either (a) create the schema (if necessary) at the time printSettings() is called, (b) put the "Generating batches ..." message back where it was in getInsert() at the start of the run or (c) move printSettings() to after the run is complete. I think my preference is (b), as creating a schema sounds like a very unexpected side effect of printSettings() and you probably want to be able to print settings without completing a full run (ruling out (c)). However, let me know if you can see a different solution or prefer a different option. Cheers Ben > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Assignee: Ben Slater >Priority: Minor > Fix For: 3.x > > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
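Editorial sketch: the failure described above, and option (b) from the same comment, can be modelled in a few lines. This Python is a toy model under stated assumptions, not the Java implementation. `Profile`, `SchemaMissing`, `load_schema_info`, and the placeholder partition and row counts are all invented for illustration. Only the "Generating batches ..." wording comes from the comment.

```python
class SchemaMissing(Exception):
    """Raised when schema details are requested before the schema exists."""


class Profile:
    """Hypothetical model of a stress profile; all names are illustrative."""

    def __init__(self):
        self.schema_exists = False
        self.log = []

    def load_schema_info(self):
        # Models the call that fails when the schema has not been created.
        if not self.schema_exists:
            raise SchemaMissing("schema has not been created yet")
        return {"partitions": 1, "rows": 10}  # placeholder values

    def batch_message(self):
        info = self.load_schema_info()
        return (
            f"Generating batches with {info['partitions']} partitions "
            f"and {info['rows']} rows"
        )

    def print_settings(self):
        # Option (b): print only settings that need no schema access,
        # so printing has no side effects and cannot fail on a fresh cluster.
        self.log.append("settings")

    def get_insert(self):
        # Stands in for the start of the run, where the schema is created
        # and the schema-dependent message can be computed safely.
        self.schema_exists = True
        self.log.append(self.batch_message())
```

In this model, computing the batch message before `get_insert()` raises `SchemaMissing`, which mirrors the reported problem, while `print_settings()` on its own always succeeds.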
[jira] [Commented] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398788#comment-15398788 ] Ben Slater commented on CASSANDRA-11914: Aah, OK - didn't quite get that that was what you were referring to. Thanks - included now. Started with a new copy of trunk, which fixed the build errors I was having. The updated changes are in a branch here: https://github.com/slater-ben/cassandra/tree/11914-trunk. > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Priority: Minor > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-11914: --- Attachment: (was: 11914-trunk.patch) > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Priority: Minor > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-11914: --- Reproduced In: 3.7 Tester: Ben Slater Since Version: 3.7 Status: Patch Available (was: Awaiting Feedback) > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Priority: Minor > Attachments: 11914-trunk.patch > > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397162#comment-15397162 ] Ben Slater edited comment on CASSANDRA-11914 at 7/28/16 7:20 AM: - Attached updated patch should be a pretty much complete implementation ready for review. Two things to note: - the patch is based on a slightly out of date version of trunk as current trunk doesn't seem to be building (even without any changes). The differences in the changed files are very minor. - I couldn't find the distribution printed at the start of the stress run that [~tjake] referred to. Happy to add it in if you can give me a bit more of a pointer. was (Author: slater_ben): Working implementation. > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Priority: Minor > Attachments: 11914-trunk.patch > > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-11914: --- Attachment: 11914-trunk.patch Working implementation. > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Priority: Minor > Attachments: 11914-trunk.patch > > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-11914: --- Attachment: (was: 11914-trunk.patch) > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Priority: Minor > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367261#comment-15367261 ] Ben Slater commented on CASSANDRA-11914: Hi - just thought I'd give this a bump. This isn't a terribly exciting task so don't want to waste my time if it's not deemed as useful. > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Priority: Minor > Attachments: 11914-trunk.patch > > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305173#comment-15305173 ] Ben Slater edited comment on CASSANDRA-11914 at 5/28/16 4:47 AM: - I've attached a proof-of-concept patch that just prints the command setting and related options. If people think this is useful and are OK with the general direction then I'll finish off in this way for the remaining settings and options. was (Author: slater_ben): Proof of concept only > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Priority: Minor > Attachments: 11914-trunk.patch > > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
[ https://issues.apache.org/jira/browse/CASSANDRA-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-11914: --- Attachment: 11914-trunk.patch Proof of concept only > Provide option for cassandra-stress to dump all settings > > > Key: CASSANDRA-11914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Ben Slater >Priority: Minor > Attachments: 11914-trunk.patch > > > cassandra-stress has quite a lot of default settings and settings that are > derived as side effects of explicit options. For people learning the tool and > saving a clear record of what was run, I think it would be useful if there > was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11914) Provide option for cassandra-stress to dump all settings
Ben Slater created CASSANDRA-11914: -- Summary: Provide option for cassandra-stress to dump all settings Key: CASSANDRA-11914 URL: https://issues.apache.org/jira/browse/CASSANDRA-11914 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Ben Slater Priority: Minor cassandra-stress has quite a lot of default settings and settings that are derived as side effects of explicit options. For people learning the tool and saving a clear record of what was run, I think it would be useful if there was an option to have the tool print all its settings at the start of a run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)