[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins
[ https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14253264#comment-14253264 ]

Benedict commented on CASSANDRA-8503:

bq. as a reason to not do

Not at all. I am suggesting we take a timeseries data model and workload and see if we can represent it, with stress' current facilities, in a way that is functionally equivalent for the tests we will be performing (which, note, will be short-lived tests, so DTCS and other compaction considerations won't be so important). It may not *look* identical, but it could be functionally equivalent all the same. If it comes up short at being able to represent it, we can introduce some small tweaks to make it much closer still (such as using the seed to bound the timestamp generation for PK and clustering columns, at which point I think the only possible limitation would be certain kinds of query slicing behaviours). The main impediment is CASSANDRA-7980.

I'm not sure what you mean about structural vs functional. It's not a dramatically difficult feature to add, it's just a matter of allocating the time to do so.

Collect important stress profiles for regression analysis done by jenkins

Key: CASSANDRA-8503
URL: https://issues.apache.org/jira/browse/CASSANDRA-8503
Project: Cassandra
Issue Type: Task
Reporter: Ryan McGuire
Assignee: Ryan McGuire
Attachments: inmemory.yaml, ycsb.yaml

We have a weekly job set up on CassCI to run a performance benchmark against the dev branches as well as the last stable releases. Here's an example: http://cstar.datastax.com/tests/id/8223fe2e-8585-11e4-b0bf-42010af0688f

This test is currently pretty basic: it's running on three nodes, with the default stress profile. We should crowdsource a collection of stress profiles to run, and then once we have many of these tests running we can collect them all into a weekly email.

Ideas:
* Timeseries (Can this be done with stress? not sure)
* compact storage
* compression off
* ...
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
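One possible reading of "using the seed to bound the timestamp generation" is sketched below. This is not how stress is implemented; the function name, the base time, and the window size are all hypothetical and chosen only for illustration. The idea shown is that a partition's seed alone decides which time window the partition covers, so a later read can regenerate the same timestamps without storing them.

```python
import random


def partition_timestamps(seed, rows, base_ms=1_400_000_000_000, window_ms=3_600_000):
    """Derive a partition's clustering timestamps from its seed alone.

    All parameters are illustrative assumptions, not stress options.
    """
    rng = random.Random(seed)
    # The seed picks which window this partition covers ...
    start = base_ms + rng.randrange(1000) * window_ms
    # ... and every clustering timestamp is drawn from inside that window.
    return sorted(start + rng.randrange(window_ms) for _ in range(rows))
```

Calling it twice with the same seed yields the same list, which is the property a read workload would rely on.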
[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins
[ https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14252297#comment-14252297 ]

Benedict commented on CASSANDRA-8503:

I would quite like to see us also generate some profiles and cluster configs specifically designed to elicit failure scenarios, perhaps in conjunction with some stress enhancements specifically designed to prevent it knocking a server over while keeping it under as much load as possible to tease out unexpected behaviour. For instance, random interleaving of all operation types (including range queries, counters, etc.) with more compaction (smaller LCS tables, smaller preemptive reopen interval). Then over time expand this testing to include periodically taking down servers, bootstrapping new ones, etc.

It would be great to occupy our test clusters with these tests automatically during any unused performance testing time. It seems that without this we lack robust acceptance testing.
[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins
[ https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14252767#comment-14252767 ]

Jonathan Shook commented on CASSANDRA-8503:

[~benedict] Unless I am mistaken, you are describing an issue in the data generation of stress as a reason to not do a meaningful time series sanity check in CI. My gut feeling on what might qualify as 'sufficiently close' is that we probably differ on the details. I'd be much more satisfied with a realistic test. I might be convinced that specific tweaks would make the test more sensitive than the target scenario, which would be ideal. I'm not sure what the complexity of fixing #7980 is, but it seems more of a structural limitation than a functional one.

I'm keenly interested in seeing these tests come to life. If it helps, I can provide a tool to drive load for time series testing.
[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins
[ https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14253096#comment-14253096 ]

Jason Brown commented on CASSANDRA-8503:

[~enigmacurry] Sure, I think that abtests yaml is rather reasonable, at least in terms of the data model. Feel free to muck with the tunings and write/read statements - I mainly just slapped that part together and hoped like hell it accomplished what I needed :).
[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins
[ https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250091#comment-14250091 ]

Ryan McGuire commented on CASSANDRA-8503:

[~slebresne] [~thobbs] [~benedict] [~tjake] [~brandon.williams] please comment: do you have stress profiles, or ideas to contribute?
[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins
[ https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250097#comment-14250097 ]

Ryan McGuire commented on CASSANDRA-8503:

[~jasobrown] I have this stress profile that [~iamaleksey] forwarded me from you, think it's worthwhile to run? http://enigmacurry.com/tmp/abtests.yaml
[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins
[ https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250108#comment-14250108 ]

Jonathan Shook commented on CASSANDRA-8503:

Currently, stress doesn't have the functionality necessary to test time-series beyond an in-memory size. It needs to support monotonically increasing times with no sorting requirements before earnest time-series tests can be run.
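To illustrate what "monotonically increasing times with no sorting requirements" could look like in a load generator, here is a minimal sketch. It is not stress code; the function name and the step size are assumptions made for the example. Each timestamp is produced from the previous one, so nothing has to be buffered or sorted, and memory use stays constant however large the partition grows.

```python
import itertools
import random


def monotonic_timestamps(seed, start_ms=0, max_step_ms=1000):
    """Yield strictly increasing timestamps, one at a time.

    Hypothetical helper: the seed makes the stream reproducible and
    max_step_ms bounds the gap between consecutive values.
    """
    rng = random.Random(seed)
    now = start_ms
    while True:
        now += rng.randint(1, max_step_ms)  # always moves forward by >= 1 ms
        yield now


# Take the first few values of the (unbounded) stream.
first_five = list(itertools.islice(monotonic_timestamps(42), 5))
```

Because the stream is a generator, a writer can pull as many values as the test needs without holding a partition's worth of timestamps in memory.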
[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins
[ https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250126#comment-14250126 ]

Jonathan Ellis commented on CASSANDRA-8503:

/cc [~aweisberg]
[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins
[ https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250242#comment-14250242 ]

Ariel Weisberg commented on CASSANDRA-8503:

I think there are two general classes of benchmarks you would run in CI: representative user workloads and targeted microbenchmark workloads. Targeted workloads are a huge help during ongoing development because they magnify the impact of regressions from code changes that are harder to notice in representative workloads. They also point to the specific subsystem being benchmarked.

I will just cover the microbenchmarks. The full matrix is large, so there is an element of wanting ponies, but the reality is that they are all interesting for preventing performance regressions and understanding the impact of ongoing changes.

Benchmark the stress client: excess server capacity and a single client, testing lots of small messages, lots of large messages, and stuff the servers can answer as fast as possible. The flip side of this workload is the same thing but for the server, where you measure how many trivially answerable tiny queries you can shove through a cluster given excess client capacity.

Benchmark performance of non-prepared statements. Benchmark performance of preparing statements?

A full test matrix for data intensive workloads would test read, write, and 50/50, and for a bonus 90/10. Single cell partitions with a small value and a large value, and a range of wide rows (small, medium, large). All 3 compaction strategies with compression on/off. Data intensive workloads also need to run against both spinning rust and SSDs.

CQL specific microbenchmarks against specific CQL datatypes. If there are interactions that are important we should capture those.

Counters

Lightweight transactions

The matrix also needs to include different permutations of replication strategies and consistency levels. Maybe we can constrain those variations to parts of the matrix that would best reflect the impact of replication strategies and CL. Probably a subset of the data intensive workloads.

Also want a workload targeting the row cache and key cache when everything is cached and when there is a realistic long tail of data not in the cache.

For every workload, to really get the value you would like a graph for throughput and a graph for latency at some percentile, with a data point per revision tested going back to the beginning, as well as a 90 day graph. A trend line also helps.

Then someone has to be "it" for monitoring the graphs and poking people when there is an issue. The workflow usually goes something like this: the monitor tags the author of the suspected bad revision, who triages it and either fixes it or hands it off to the correct person. Timeliness is really important because once regressions start stacking it's a pain to know whether you have done what you should to fix it.
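To give a feel for how large the data intensive part of that matrix gets, the dimensions named above can be enumerated with a short script. This is only a sketch: the labels are made up for the example, and the three compaction strategies are assumed to be size-tiered, leveled, and date-tiered, which the comment does not spell out.

```python
import itertools

# Dimension values paraphrased from the comment; labels are illustrative.
WORKLOADS = ["read", "write", "mixed_50_50", "mixed_90_10"]
PARTITION_SHAPES = [
    "single_cell_small_value",
    "single_cell_large_value",
    "wide_row_small",
    "wide_row_medium",
    "wide_row_large",
]
# Assumption: these are the "3 compaction strategies" being referred to.
COMPACTION = [
    "SizeTieredCompactionStrategy",
    "LeveledCompactionStrategy",
    "DateTieredCompactionStrategy",
]
COMPRESSION = [True, False]
STORAGE = ["spinning_disk", "ssd"]


def benchmark_matrix():
    """Return every combination of the dimensions above as a dict."""
    keys = ["workload", "partition_shape", "compaction", "compression", "storage"]
    dims = [WORKLOADS, PARTITION_SHAPES, COMPACTION, COMPRESSION, STORAGE]
    return [dict(zip(keys, combo)) for combo in itertools.product(*dims)]


if __name__ == "__main__":
    print(len(benchmark_matrix()))  # 4 * 5 * 3 * 2 * 2 = 240
```

Even before replication strategies and consistency levels are added, that is 240 runs, which is why constraining those variations to a subset of the matrix matters.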
[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins
[ https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250433#comment-14250433 ]

T Jake Luciani commented on CASSANDRA-8503:

We should also run all of these at RF=3 with QUORUM reads/writes.
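For the built-in stress commands, running at RF=3 with QUORUM might be wired into a job roughly as follows. This is a sketch under assumptions: the helper name and node addresses are hypothetical, and the option spellings (`cl=`, `-schema replication(factor=...)`, `-node`) reflect my understanding of the 2.1-era tool, so check them against `cassandra-stress help` before relying on them. For `user` profiles the replication factor would normally come from the keyspace definition in the yaml instead.

```python
def stress_command(op, n, nodes, rf=3, cl="QUORUM"):
    """Build an argv list for a cassandra-stress run.

    Hypothetical helper; option names are assumed, not verified here.
    """
    return [
        "cassandra-stress", op,
        f"n={n}",
        f"cl={cl}",
        "-schema", f"replication(factor={rf})",
        "-node", ",".join(nodes),
    ]


# Example with placeholder addresses for a three node cluster.
argv = stress_command("write", 1_000_000, ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
```

Returning a list rather than a single string lets the job pass it to `subprocess.run` without shell quoting problems around the parentheses.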
[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins
[ https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250811#comment-14250811 ]

Benedict commented on CASSANDRA-8503:

[~jshook] I think we can model a timeseries workload sufficiently closely already (it won't be identical, but it will behave the same - no doubt some minor tweaks can make it feel closer). The only problem is that the generation of large partitions with only a single clustering column is currently very inefficient. See CASSANDRA-7980.