[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-19 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253264#comment-14253264
 ] 

Benedict commented on CASSANDRA-8503:
-

bq.  as a reason to not do

Not at all. I am suggesting we take a timeseries data model and workload and 
see if we can represent it in a way that is functionally equivalent for the 
tests we will be performing (which, note, will be short lived tests so DTCS and 
other compaction considerations won't be so important) with stress' current 
facilities. It may not *look* identical, but it could be functionally 
equivalent all the same. If it comes up short at being able to represent it, we 
can introduce some small tweaks to make it much closer still (such as using the 
seed to bound the timestamp generation for PK and clustering columns, at which 
point I think the only possible limitation would be certain kinds of query 
slicing behaviours). 

The main impediment is CASSANDRA-7980. I'm not sure what you mean by 
structural vs functional. It's not a dramatically difficult feature to add; 
it's just a matter of allocating the time to do so.
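
As a rough illustration of the "use the seed to bound the timestamp generation" 
idea, here is a minimal Python sketch. It is not stress code: the helper names, 
the base epoch, and the one-hour window are all assumptions made up for the 
example. It only shows how a seed could pin each partition to a bounded, 
repeatable time range.

```python
import random

# Hypothetical constants, not taken from cassandra-stress.
BASE_EPOCH_MS = 1_400_000_000_000   # arbitrary illustrative origin
WINDOW_MS = 60 * 60 * 1000          # each seed owns a one-hour window


def window_for_seed(seed):
    """Return the [start, end) millisecond window owned by a seed."""
    start = BASE_EPOCH_MS + seed * WINDOW_MS
    return start, start + WINDOW_MS


def timestamps_for_seed(seed, count):
    """Generate `count` sorted timestamps inside the seed's window.

    The same seed always yields the same values, so a later read can
    regenerate the clustering values that an earlier write produced.
    """
    start, end = window_for_seed(seed)
    rng = random.Random(seed)
    return sorted(rng.randrange(start, end) for _ in range(count))
```

Because consecutive seeds map to adjacent, non-overlapping windows, visiting 
seeds in order would also approximate data arriving in time order.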

 Collect important stress profiles for regression analysis done by jenkins
 -

 Key: CASSANDRA-8503
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8503
 Project: Cassandra
  Issue Type: Task
Reporter: Ryan McGuire
Assignee: Ryan McGuire
 Attachments: inmemory.yaml, ycsb.yaml


 We have a weekly job set up on CassCI to run a performance benchmark against 
 the dev branches as well as the last stable releases.
 Here's an example:
 http://cstar.datastax.com/tests/id/8223fe2e-8585-11e4-b0bf-42010af0688f
 This test is currently pretty basic: it runs on three nodes with the 
 default stress profile. We should crowdsource a collection of stress profiles 
 to run, and then once we have many of these tests running we can collect them 
 all into a weekly email.
 Ideas:
  * Timeseries (Can this be done with stress? not sure)
  * compact storage
  * compression off
  * ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-18 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252297#comment-14252297
 ] 

Benedict commented on CASSANDRA-8503:
-

I would quite like to see us also generate some profiles and cluster configs 
specifically designed to elicit failure scenarios. Perhaps in conjunction with 
some stress enhancements specifically designed to prevent it knocking a server 
over, but keep it under as much load as possible to tease out unexpected 
behaviour.

For instance, random interleaving of all operation types (including range 
queries, counters, etc.) with more compaction (smaller LCS tables, smaller 
preemptive reopen interval). Then, over time, expand this testing to include 
periodically taking down servers, bootstrapping new ones, etc. It would be 
great to occupy our test clusters with these tests automatically during any 
unused performance testing time. Without this, it seems we lack robust 
acceptance testing.
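
A minimal Python sketch of what "random interleaving of all operation types" 
could look like as a schedule. The operation names and weights are assumptions 
chosen for illustration and are not stress options. A fixed seed keeps the 
schedule repeatable, so a failure can be replayed.

```python
import random

# Illustrative operation mix; names and weights are assumptions.
OPERATIONS = {
    "write": 4,
    "read": 4,
    "range_query": 1,
    "counter_update": 1,
}


def interleaved_ops(seed, total):
    """Return a repeatable, randomly interleaved list of operation names."""
    rng = random.Random(seed)
    names = list(OPERATIONS)
    weights = [OPERATIONS[name] for name in names]
    # Weighted sampling with replacement gives an unordered mix of all types.
    return rng.choices(names, weights=weights, k=total)
```

A driver could walk this list and dispatch each name to the matching request, 
with fault injection (node restarts, bootstraps) layered on separately.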



[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-18 Thread Jonathan Shook (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252767#comment-14252767
 ] 

Jonathan Shook commented on CASSANDRA-8503:
---

[~benedict]
Unless I am mistaken, you are describing an issue in the data generation of 
stress as a reason to not do a meaningful time series sanity check in CI. My 
gut feeling on what might qualify as 'sufficiently close' is that we probably 
differ on the details. I'd be much more satisfied with a realistic test. I 
might be convinced that specific tweaks would make the test more sensitive than 
the target scenario, which would be ideal.

I'm not sure what the complexity of fixing #7980 is, but it seems more of a 
structural limitation than a functional one. I'm keenly interested in seeing 
these tests come to life. If it helps, I can provide a tool to drive load for 
time series testing. 




[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-18 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253096#comment-14253096
 ] 

Jason Brown commented on CASSANDRA-8503:


[~enigmacurry] Sure, I think that abtests yaml is rather reasonable, at least 
in terms of the data model. Feel free to muck with the tunings and 
write/read statements - I mainly just slapped that part together and hoped like 
hell it accomplished what I needed :). 



[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250091#comment-14250091
 ] 

Ryan McGuire commented on CASSANDRA-8503:
-

[~slebresne] [~thobbs] [~benedict] [~tjake] [~brandon.williams] please comment: 
do you have stress profiles or ideas to contribute?



[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250097#comment-14250097
 ] 

Ryan McGuire commented on CASSANDRA-8503:
-

[~jasobrown] I have this stress profile that [~iamaleksey] forwarded me from 
you, think it's worthwhile to run? 

http://enigmacurry.com/tmp/abtests.yaml



[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread Jonathan Shook (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250108#comment-14250108
 ] 

Jonathan Shook commented on CASSANDRA-8503:
---

Currently, stress doesn't have the functionality necessary to test time-series 
beyond an in-memory size. It needs to support monotonically increasing times 
with no sorting requirements before earnest time-series tests can be run.
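
To make the requirement concrete, here is a minimal Python sketch of a 
generator that produces monotonically increasing times without any sorting. 
It is an illustration only, not how stress works; the function name and the 
`max_step_ms` parameter are assumptions.

```python
import itertools
import random


def monotonic_timestamps(start_ms, seed, max_step_ms=1000):
    """Yield strictly increasing millisecond timestamps, one at a time.

    Only the previous value is kept, so memory use stays constant no matter
    how many rows are generated, and no sort is ever needed.
    """
    rng = random.Random(seed)
    current = start_ms
    while True:
        # Every step is at least 1 ms, so values can never repeat or go back.
        current += rng.randint(1, max_step_ms)
        yield current


# Usage: take the first five values of an otherwise unbounded stream.
first_five = list(itertools.islice(monotonic_timestamps(0, seed=42), 5))
```

Since the stream is never materialised, the data set size is not limited by 
what fits in memory on the client.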



[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250126#comment-14250126
 ] 

Jonathan Ellis commented on CASSANDRA-8503:
---

/cc [~aweisberg]



[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250242#comment-14250242
 ] 

Ariel Weisberg commented on CASSANDRA-8503:
---

I think there are two general classes of benchmarks you would run in CI: 
representative user workloads and targeted microbenchmark workloads. Targeted 
workloads are a huge help during ongoing development because they magnify the 
impact of regressions from code changes that are harder to notice in 
representative workloads. They also point to the specific subsystem being 
benchmarked.

I will just cover the microbenchmarks. The full matrix is large, so there is an 
element of wanting ponies, but in reality they are all interesting for 
preventing performance regressions and understanding the impact of ongoing 
changes.

Benchmark the stress client: with excess server capacity, have a single client 
send lots of small messages, then lots of large messages, all of them requests 
the servers can answer as fast as possible. The flip side of this workload is 
the same thing for the server, where you measure how many trivially answerable 
tiny queries you can shove through a cluster given excess client capacity.

Benchmark performance of non-prepared statements.

Benchmark performance of preparing statements?
 
A full test matrix for data intensive workloads would test read, write, and 
50/50, and for a bonus 90/10. Single cell partitions with a small value and a 
large value, and a range of wide rows (small, medium, large). All 3 compaction 
strategies with compression on/off. Data intensive workloads also need to run 
against both spinning rust and SSDs.
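
To show how quickly this matrix grows, here is a minimal Python sketch that 
enumerates it. The labels are made up for the example, and it assumes that the 
three compaction strategies are size-tiered, leveled, and date-tiered, which 
the comment does not name explicitly.

```python
import itertools

# Dimension labels are illustrative; only the dimensions come from the text.
WORKLOADS = ["read", "write", "mixed_50_50", "mixed_90_10"]
PARTITION_SHAPES = [
    "single_cell_small_value",
    "single_cell_large_value",
    "wide_row_small",
    "wide_row_medium",
    "wide_row_large",
]
# Assumption: these are the three strategies meant.
COMPACTION = [
    "SizeTieredCompactionStrategy",
    "LeveledCompactionStrategy",
    "DateTieredCompactionStrategy",
]
COMPRESSION = [True, False]
STORAGE = ["spinning_disk", "ssd"]


def build_matrix():
    """Return one dict per combination of the five dimensions."""
    keys = ["workload", "partition_shape", "compaction", "compression", "storage"]
    combos = itertools.product(
        WORKLOADS, PARTITION_SHAPES, COMPACTION, COMPRESSION, STORAGE
    )
    return [dict(zip(keys, combo)) for combo in combos]
```

With these labels, `build_matrix()` yields 4 x 5 x 3 x 2 x 2 = 240 
configurations before replication strategies or consistency levels are added, 
which is why constraining some dimensions to a subset is attractive.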

CQL specific microbenchmarks against specific CQL datatypes. If there are 
interactions that are important we should capture those.

Counters

Lightweight transactions

The matrix also needs to include different permutations of replication 
strategies and consistency levels. Maybe we can constrain those variations to 
parts of the matrix that would best reflect the impact of replication 
strategies and CL. Probably a subset of the data intensive workloads.

Also want a workload targeting the row cache and key cache when everything is 
cached and when there is a realistic long tail of data not in the cache.

To really get the value from every workload, you would like a graph for 
throughput and a graph for latency at some percentile, with a data point per 
revision tested going back to the beginning, as well as a 90-day graph. A trend 
line also helps. Then someone has to be "it" for monitoring the graphs and 
poking people when there is an issue.
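
One simple way to turn per-revision data points into an alert is sketched below 
in Python. It compares each revision against a rolling median of the preceding 
ones rather than fitting a trend line; the function name, window size, and 10% 
tolerance are assumptions picked for illustration.

```python
from statistics import median


def flag_regressions(throughputs, window=5, tolerance=0.10):
    """Return indexes of revisions whose throughput dropped noticeably.

    `throughputs` holds one ops/sec data point per tested revision, oldest
    first. A revision is flagged when it falls more than `tolerance` below
    the median of the preceding `window` revisions.
    """
    flagged = []
    for i in range(window, len(throughputs)):
        baseline = median(throughputs[i - window:i])
        if throughputs[i] < baseline * (1 - tolerance):
            flagged.append(i)
    return flagged
```

A median baseline keeps a single noisy run from hiding or faking a regression, 
but a person still has to look at the graph before blaming a revision.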

The workflow usually goes something like this: the monitor tags the author of 
the suspected bad revision, who triages it and either fixes it or hands it off 
to the correct person. Timeliness is really important because once regressions 
start stacking up, it's a pain to know whether you have done what you should to 
fix them.



[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250433#comment-14250433
 ] 

T Jake Luciani commented on CASSANDRA-8503:
---

We should also run all of these at RF=3 with QUORUM reads/writes.
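
For reference, a QUORUM operation needs a majority of replicas to respond. A 
minimal Python sketch of that arithmetic (the function name is made up for the 
example):

```python
def quorum(replication_factor):
    """Number of replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1
```

With RF=3 this gives 2, so every read and write in such a run has to wait on 
two replicas and can still succeed with one replica down.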



[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250811#comment-14250811
 ] 

Benedict commented on CASSANDRA-8503:
-

[~jshook] I think we can model a timeseries workload sufficiently closely 
already (it won't be identical, but it will behave the same - no doubt some 
minor tweaks can make it feel closer). The only problem is that generating 
large partitions with only a single clustering column is currently very 
inefficient. See CASSANDRA-7980.
