I don't expect everyone to run a 500-node cluster off to the side to test their patches, but at least some indication that the contributor started Cassandra on their laptop would be a good sign. The JIRA I referenced was an optimization around List, Set and Map serialization. Would it really have been that crazy to run a handful of benchmarks locally and post those results?
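To make it concrete, here's roughly the level of effort I have in mind: a minimal JMH skeleton in the style of what already lives under test/microbench/. To be clear, the class name and the serialize() body below are illustrative stand-ins for whatever code path a patch like that touches, not Cassandra's actual collection serializer:

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.TimeUnit;
    import org.openjdk.jmh.annotations.*;

    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @Warmup(iterations = 5, time = 1)
    @Measurement(iterations = 5, time = 1)
    @Fork(1)
    @State(Scope.Benchmark)
    public class ListSerializationBench
    {
        // Collection sizes to measure; serialization cost rarely scales
        // linearly, so a few points are more honest than one.
        @Param({"10", "100", "1000"})
        int size;

        private List<Integer> value;

        @Setup
        public void setup()
        {
            value = new ArrayList<>(size);
            for (int i = 0; i < size; i++)
                value.add(i);
        }

        @Benchmark
        public ByteBuffer serialize()
        {
            // Stand-in for the code under test: element count followed by
            // length-prefixed int elements, roughly the wire shape of a list.
            ByteBuffer out = ByteBuffer.allocate(4 + 8 * value.size());
            out.putInt(value.size());
            for (int v : value)
            {
                out.putInt(4); // element length
                out.putInt(v); // element payload
            }
            out.flip();
            return out;
        }
    }

Run it on the branch and on trunk, paste both sets of numbers into the JIRA, and the reviewer has something to work with. That's an afternoon, not a 500-node cluster.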
On Thu, Mar 9, 2017 at 12:26 PM Ariel Weisberg <ar...@weisberg.ws> wrote:

> Hi,
>
> I think there are issues around the availability of hardware sufficient
> to demonstrate the performance concerns under test. It's an open source
> project without centralized infrastructure. A lot of performance
> contributions come from people running C* in production. They are
> already running these changes and have seen the improvement, but
> communicating that to the outside world in a convincing way can be hard.
>
> Right now we don't even have performance in continuous integration. I
> think we are putting the cart before the horse in that respect. What
> about all the commits that don't intend to have a performance impact
> but do? Even if we had performance metrics in CI, who is going to
> triage the results religiously?
>
> We also, to my knowledge, don't have benchmarks for key functionality
> in cassandra-stress. Can cassandra-stress benchmark CAS? My
> recollection from every time I looked is that it wasn't there.
>
> We can only set the bar as high as contributors are able to meet.
> Certainly, if they can't justify why they can't benchmark the thing
> they want to contribute, then reviewers should make them go and
> benchmark it.
>
> Regards,
> Ariel
>
> On Thu, Mar 9, 2017, at 03:11 PM, Jeff Jirsa wrote:
> > Agree. Anything that's meant to increase performance should
> > demonstrate that it actually does. We have microbench available in
> > recent versions - writing a new microbenchmark isn't all that
> > onerous. It would be great if we had perf tests included in the
> > normal testall/dtest workflow for ALL patches so we could quickly
> > spot regressions, but that gets pretty expensive in terms of running
> > tests long enough to actually exercise the most common code paths.
> >
> > On Thu, Mar 9, 2017 at 12:00 PM, Jonathan Haddad <j...@jonhaddad.com>
> > wrote:
> >
> > > I'd like to discuss what I consider to be a pretty important
> > > matter - patches which are written for the sole purpose of
> > > improving performance without including a single performance
> > > benchmark in the JIRA.
> > >
> > > My original email was in "Testing and Jira Tickets"; I'll copy it
> > > here for posterity:
> > >
> > > If you don't mind, I'd like to broaden the discussion a little bit
> > > to also discuss performance related patches. For instance,
> > > CASSANDRA-13271 was a performance / optimization related patch
> > > that included *zero* information on whether there was any perf
> > > improvement or a regression as a result of the change, even though
> > > I've asked twice for that information.
> > >
> > > In addition to "does this thing break anything?" we should be
> > > asking "how does this patch affect performance?" (and whether the
> > > appropriate docs were included, but that's another topic
> > > altogether).
> > >
> > > There's a minor note about perf-related stuff here:
> > > http://cassandra.apache.org/doc/latest/development/testing.html#performance-testing
> > >
> > > "Performance tests for Cassandra are a special breed of tests that
> > > are not part of the usual patch contribution process. In fact you
> > > can contribute tons of patches to Cassandra without ever running
> > > performance tests. They are important however when working on
> > > performance improvements, as such improvements must be measurable."
> > >
> > > I think performance numbers aren't just important, but should be a
> > > *requirement* to merge a performance patch.
> > >
> > > Thoughts?
> > > Jon
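P.S. For anyone following along who hasn't touched the tooling: even the coarse end of the spectrum is a couple of commands. These are stock cassandra-stress invocations (the op counts, thread count and path are illustrative, and as Ariel notes there's still no built-in CAS workload):

    # populate a table, then measure reads - run both before and after the patch
    tools/bin/cassandra-stress write n=1000000 -rate threads=50
    tools/bin/cassandra-stress read n=1000000 -rate threads=50

It won't catch everything a big cluster would, but it's a far better floor than nothing.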