[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-19 Thread Benedict (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14253264#comment-14253264
]

Benedict commented on CASSANDRA-8503:
-

> as a reason to not do

Not at all. I am suggesting we take a timeseries data model and workload and
see if we can represent it in a way that is functionally equivalent for the
tests we will be performing (which, note, will be short lived tests so DTCS and
other compaction considerations won't be so important) with stress' current
facilities. It may not *look* identical, but it could be functionally
equivalent all the same. If it comes up short at being able to represent it, we
can introduce some small tweaks to make it much closer still (such as using the
seed to bound the timestamp generation for PK and clustering columns, at which
point I think the only possible limitation would be certain kinds of query
slicing behaviours).
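
As a rough illustration of the seed-bounded timestamp idea (this is not how
stress works today, only a sketch under stated assumptions), each partition
seed can be mapped to its own fixed time window, with clustering timestamps
generated deterministically inside that window. EPOCH, WINDOW, and both
function names below are hypothetical.

```python
import random
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

# Hypothetical parameters, chosen only for illustration.
EPOCH = datetime(2014, 1, 1, tzinfo=timezone.utc)
WINDOW = timedelta(hours=1)


def partition_window(seed: int) -> Tuple[datetime, datetime]:
    """Map a partition seed to its own fixed time window [start, end)."""
    start = EPOCH + seed * WINDOW
    return start, start + WINDOW


def clustering_timestamps(seed: int, rows: int) -> List[datetime]:
    """Generate `rows` increasing timestamps inside the seed's window.

    The window is split into `rows` equal slots and one timestamp is drawn
    from each slot, so the result is already ordered and needs no sort.
    """
    start, end = partition_window(seed)
    span_ms = int((end - start).total_seconds() * 1000)
    if not 0 < rows <= span_ms:
        raise ValueError("rows must be between 1 and the window size in ms")
    rng = random.Random(seed)  # same seed -> same timestamps on every run
    slot_ms = span_ms // rows
    return [
        start + timedelta(milliseconds=i * slot_ms + rng.randrange(slot_ms))
        for i in range(rows)
    ]
```

Because the output depends only on the seed, a later read pass could
regenerate the same keys it wrote earlier without storing them.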

The main impediment is CASSANDRA-7980. I'm not sure what you mean about
structural vs functional. It's not a dramatically difficult feature to add;
it's just a matter of allocating the time to do so.

Collect important stress profiles for regression analysis done by jenkins
-

Key: CASSANDRA-8503
URL: https://issues.apache.org/jira/browse/CASSANDRA-8503
Project: Cassandra
Issue Type: Task
Reporter: Ryan McGuire
Assignee: Ryan McGuire
Attachments: inmemory.yaml, ycsb.yaml

We have a weekly job set up on CassCI to run a performance benchmark against
the dev branches as well as the last stable releases.
Here's an example:
http://cstar.datastax.com/tests/id/8223fe2e-8585-11e4-b0bf-42010af0688f
This test is currently pretty basic: it runs on three nodes with the
default stress profile. We should crowdsource a collection of stress profiles
to run, and then once we have many of these tests running we can collect them
all into a weekly email.
Ideas:
* Timeseries (Can this be done with stress? not sure)
* compact storage
* compression off
* ...

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-18 Thread Benedict (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14252297#comment-14252297
]

Benedict commented on CASSANDRA-8503:
-

I would quite like to see us also generate some profiles and cluster configs
specifically designed to elicit failure scenarios. Perhaps in conjunction with
some stress enhancements specifically designed to prevent it knocking a server
over, but keep it under as much load as possible to tease out unexpected
behaviour.

For instance, random interleaving of all operation types (including range
queries, counters, etc) with more compaction (smaller LCS tables, smaller
preemptive reopen interval). Then over time expand this testing to include
periodically taking down servers, bootstrapping new ones, etc. It would be
great to occupy our test clusters with these tests automatically during any
unused performance testing time. It seems without this we lack robust
acceptance testing.
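
A minimal sketch of what random interleaving of operation types could look
like in a test driver. The operation names and weights are hypothetical, and
nothing here reflects an existing stress option; the point is only that a
seeded generator makes a failing interleaving reproducible.

```python
import random
from typing import Dict, Iterator

# Hypothetical operation mix; the names and weights are illustrative only.
OPERATION_WEIGHTS: Dict[str, float] = {
    "insert": 4,
    "read": 4,
    "range_query": 1,
    "counter_update": 1,
}


def interleaved_operations(count: int, seed: int = 0) -> Iterator[str]:
    """Yield `count` operation names in a weighted random order.

    Seeding the generator means the same interleaving can be replayed
    when a run turns up unexpected behaviour.
    """
    rng = random.Random(seed)
    names = list(OPERATION_WEIGHTS)
    weights = list(OPERATION_WEIGHTS.values())
    for _ in range(count):
        yield rng.choices(names, weights=weights, k=1)[0]
```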


[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-18 Thread Jonathan Shook (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14252767#comment-14252767
]

Jonathan Shook commented on CASSANDRA-8503:
---

[~benedict]
Unless I am mistaken, you are describing an issue in the data generation of
stress as a reason to not do a meaningful time series sanity check in CI. My
gut feeling on what might qualify as 'sufficiently close' is that we probably
differ on the details. I'd be much more satisfied with a realistic test. I
might be convinced that specific tweaks would make the test more sensitive than
the target scenario, which would be ideal.

I'm not sure what the complexity of fixing #7980 is, but it seems more of a
structural limitation than a functional one. I'm keenly interested in seeing
these tests come to life. If it helps, I can provide a tool to drive load for
time series testing.


[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-18 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14253096#comment-14253096
 ] 

Jason Brown commented on CASSANDRA-8503:


[~enigmacurry] Sure, I think that abtests yaml is rather reasonable, at least
in terms of the data model. Feel free to muck with the tunings and
write/read statements - I mainly just slapped that part together and hoped like
hell it accomplished what I needed :).


[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread Ryan McGuire (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250091#comment-14250091
 ] 

Ryan McGuire commented on CASSANDRA-8503:
-

[~slebresne] [~thobbs] [~benedict] [~tjake] [~brandon.williams] please comment:
do you have stress profiles or ideas to contribute?


[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread Ryan McGuire (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250097#comment-14250097
 ] 

Ryan McGuire commented on CASSANDRA-8503:
-

[~jasobrown] I have this stress profile that [~iamaleksey] forwarded to me from
you. Do you think it's worthwhile to run?

http://enigmacurry.com/tmp/abtests.yaml


[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread Jonathan Shook (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250108#comment-14250108
 ] 

Jonathan Shook commented on CASSANDRA-8503:
---

Currently, stress doesn't have the functionality necessary to test time-series 
beyond an in-memory size. It needs to support monotonically increasing times 
with no sorting requirements before earnest time-series tests can be run.
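
To make the requirement concrete, here is a minimal sketch of a generator
that produces strictly increasing timestamps one at a time, so nothing has
to be buffered or sorted no matter how many rows are written. The function
name and parameters are hypothetical and are not part of stress.

```python
import random
from typing import Iterator


def monotonic_timestamps(start_ms: int, max_gap_ms: int,
                         seed: int = 0) -> Iterator[int]:
    """Yield an endless stream of strictly increasing millisecond timestamps.

    Only the most recent value is kept, so memory use stays constant however
    far the stream runs.
    """
    if max_gap_ms < 1:
        raise ValueError("max_gap_ms must be at least 1")
    rng = random.Random(seed)
    current = start_ms
    while True:
        # A gap of at least 1 ms keeps the sequence strictly increasing.
        current += rng.randint(1, max_gap_ms)
        yield current
```

A caller would pull as many values as it needs, for example with
itertools.islice(monotonic_timestamps(0, 1000), 10).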


[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250126#comment-14250126
 ] 

Jonathan Ellis commented on CASSANDRA-8503:
---

/cc [~aweisberg]


[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread Ariel Weisberg (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250242#comment-14250242
]

Ariel Weisberg commented on CASSANDRA-8503:
---

I think there are two general classes of benchmarks you would run in CI:
representative user workloads and targeted microbenchmark workloads. Targeted
workloads are a huge help during ongoing development because they magnify the
impact of regressions from code changes that are harder to notice in
representative workloads. They also point to the specific subsystem being
benchmarked.

I will just cover the microbenchmarks. The full matrix is large, so there is an
element of wanting ponies, but the reality is that they are all interesting
for preventing performance regressions and understanding the impact of
ongoing changes.

Benchmark the stress client: give it excess server capacity and have a single
client send lots of small messages, then lots of large messages, all of them
things the servers can answer as fast as possible. The flip side of this
workload is the same thing for the server, where you measure how many trivially
answerable tiny queries you can shove through a cluster given excess client
capacity.

Benchmark performance of non-prepared statements.

Benchmark performance of preparing statements?

A full test matrix for data intensive workloads would test read, write, and
50/50, and for a bonus 90/10. Single cell partitions with a small value and a
large value, and a range of wide rows (small, medium, large). All 3 compaction
strategies with compression on/off. Data intensive workloads also need to run
against both spinning rust and SSDs.
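
To give a feel for how quickly this matrix grows, the dimensions above can be
enumerated as a cross product. This is only a sketch: the labels are
paraphrased from the paragraph above, and the three compaction strategy names
are an assumption, since the comment only says "all 3".

```python
from itertools import product
from typing import Dict, Iterator, List

# Labels paraphrase the comment above. The compaction strategy names are an
# assumption: the comment does not say which three strategies are meant.
DIMENSIONS: Dict[str, List[str]] = {
    "workload": ["read", "write", "mixed_50_50", "mixed_90_10"],
    "partition": ["single_cell_small", "single_cell_large",
                  "wide_small", "wide_medium", "wide_large"],
    "compaction": ["SizeTiered", "Leveled", "DateTiered"],
    "compression": ["on", "off"],
    "disk": ["spinning", "ssd"],
}


def benchmark_matrix() -> Iterator[Dict[str, str]]:
    """Yield one dict per combination of the dimension values."""
    keys = list(DIMENSIONS)
    for combo in product(*(DIMENSIONS[k] for k in keys)):
        yield dict(zip(keys, combo))


if __name__ == "__main__":
    cells = list(benchmark_matrix())
    print(len(cells), "configurations")
```

With these five dimensions the product is 4 x 5 x 3 x 2 x 2 = 240
configurations, before replication strategies and consistency levels are
added.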

CQL specific microbenchmarks against specific CQL datatypes. If there are
interactions that are important we should capture those.

Counters

Lightweight transactions

The matrix also needs to include different permutations of replication
strategies and consistency levels. Maybe we can constrain those variations to
parts of the matrix that would best reflect the impact of replication
strategies and CL. Probably a subset of the data intensive workloads.

Also want a workload targeting the row cache and key cache when everything is
cached and when there is a realistic long tail of data not in the cache.

To really get the value from every workload, you would want a graph for
throughput and a graph for latency at some percentile, with a data point per
revision tested going back to the beginning, as well as a 90 day graph. A trend
line also helps. Then someone has to be responsible for monitoring the graphs
and poking people when there is an issue.

The workflow usually goes something like this: the monitor tags the author of
the suspected bad revision, who triages it and either fixes it or hands it off
to the correct person. Timeliness is really important because once regressions
start stacking it's a pain to know whether you have done what you should to fix
it.
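
A minimal sketch of the kind of check a monitor could automate over
per-revision throughput numbers: a least-squares trend line, plus a flag for
any revision that falls well below the median of the revisions before it.
The function names, window size, and tolerance are hypothetical choices, not
anything CassCI is known to do.

```python
from statistics import median
from typing import List, Sequence, Tuple


def trend_line(values: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line through values,
    using the revision index 0..n-1 as the x axis."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flag_regressions(throughput: Sequence[float], window: int = 5,
                     tolerance: float = 0.05) -> List[int]:
    """Return indices of revisions whose throughput is more than `tolerance`
    below the median of the preceding `window` revisions."""
    flagged = []
    for i in range(window, len(throughput)):
        baseline = median(throughput[i - window:i])
        if throughput[i] < baseline * (1 - tolerance):
            flagged.append(i)
    return flagged
```

Comparing against a median of recent revisions rather than the single
previous one is a deliberate choice here: it tolerates one noisy run without
hiding a sustained drop.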


[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread T Jake Luciani (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250433#comment-14250433
 ] 

T Jake Luciani commented on CASSANDRA-8503:
---

We should also run all of these at RF=3 with QUORUM reads and writes.


[jira] [Commented] (CASSANDRA-8503) Collect important stress profiles for regression analysis done by jenkins

2014-12-17 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250811#comment-14250811
 ] 

Benedict commented on CASSANDRA-8503:
-

[~jshook] I think we can model a timeseries workload sufficiently closely
already (it won't be identical, but it will behave the same - no doubt some
minor tweaks can make it feel closer). The only problem is that the generation
of large partitions with only a single clustering column is currently very
inefficient. See CASSANDRA-7980.

