The elephant in the room to me is what Ariel said:

> What about all the commits that don't intend to have a performance impact
> but do?
On top of that, it's all well and good to microbench the change you make in
a vacuum but what happens when the resource usage implications of that
change have side-effects on the rest of the system? I've always been a
vocal proponent of proving your improvement w/numbers (be they
microbenchmark or stress-based single node, etc), so I'm strongly +1 on us
improving our hygiene on this front.

On Thu, Mar 9, 2017 at 4:56 PM, Ariel Weisberg <ar...@weisberg.ws> wrote:

> Hi,
>
> I should clarify. Should in the sense of it was fine for the process we
> had at the time (ugh!) but it's not what we should do in the future.
>
> Ariel
>
> On Thu, Mar 9, 2017, at 04:55 PM, Ariel Weisberg wrote:
> > Hi,
> >
> > I agree that patch could have used it and it was amenable to
> > micro-benchmarking. Just to be pedantic about process, which is
> > something I approve of to the chagrin of so many.
> >
> > On a completely related note, that change also randomly boxes a
> > boolean.
> >
> > Ariel
> >
> > On Thu, Mar 9, 2017, at 03:45 PM, Jonathan Haddad wrote:
> > > I don't expect everyone to run a 500 node cluster off to the side to
> > > test their patches, but at least some indication that the contributor
> > > started Cassandra on their laptop would be a good sign. The JIRA I
> > > referenced was an optimization around List, Set and Map
> > > serialization. Would it really have been that crazy to run a handful
> > > of benchmarks locally and post those results?
> > >
> > > On Thu, Mar 9, 2017 at 12:26 PM Ariel Weisberg <ar...@weisberg.ws>
> > > wrote:
> > > > Hi,
> > > >
> > > > I think there are issues around the availability of hardware
> > > > sufficient to demonstrate the performance concerns under test. It's
> > > > an open source project without centralized infrastructure. A lot of
> > > > performance contributions come from people running C* in
> > > > production.
> > > > They are already running these changes and have seen the
> > > > improvement, but communicating that to the outside world in a
> > > > convincing way can be hard.
> > > >
> > > > Right now we don't even have performance in continuous integration.
> > > > I think we are putting the cart before the horse in that respect.
> > > > What about all the commits that don't intend to have a performance
> > > > impact but do? Even if we had performance metrics in CI, who is
> > > > going to triage the results religiously?
> > > >
> > > > We also, to my knowledge, don't have benchmarks for key
> > > > functionality in cassandra-stress. Can cassandra-stress benchmark
> > > > CAS? My recollection every time I looked is that it wasn't there.
> > > >
> > > > We can only set the bar as high as contributors are able to meet.
> > > > Certainly if they can't justify why they can't benchmark the thing
> > > > they want to contribute, then reviewers should make them go and
> > > > benchmark it.
> > > >
> > > > Regards,
> > > > Ariel
> > > >
> > > > On Thu, Mar 9, 2017, at 03:11 PM, Jeff Jirsa wrote:
> > > > > Agree. Anything that's meant to increase performance should
> > > > > demonstrate it actually does that. We have microbench available
> > > > > in recent versions - writing a new microbenchmark isn't all that
> > > > > onerous. Would be great if we had perf tests included in the
> > > > > normal testall/dtest workflow for ALL patches so we could quickly
> > > > > spot regressions, but that gets pretty expensive in terms of
> > > > > running long enough tests to actually see most common code paths.
> > > > >
> > > > > On Thu, Mar 9, 2017 at 12:00 PM, Jonathan Haddad
> > > > > <j...@jonhaddad.com> wrote:
> > > > > > I'd like to discuss what I consider to be a pretty important
> > > > > > matter - patches which are written for the sole purpose of
> > > > > > improving performance without including a single performance
> > > > > > benchmark in the JIRA.
> > > > > >
> > > > > > My original email was in "Testing and Jira Tickets"; I'll copy
> > > > > > it here for posterity:
> > > > > >
> > > > > > If you don't mind, I'd like to broaden the discussion a little
> > > > > > bit to also discuss performance related patches. For instance,
> > > > > > CASSANDRA-13271 was a performance / optimization related patch
> > > > > > that included *zero* information on whether there was any perf
> > > > > > improvement or a regression as a result of the change, even
> > > > > > though I've asked twice for that information.
> > > > > >
> > > > > > In addition to "does this thing break anything" we should be
> > > > > > asking "how does this patch affect performance?" (and were the
> > > > > > appropriate docs included, but that's another topic altogether)
> > > > > >
> > > > > > There's a minor note about perf related stuff here:
> > > > > > http://cassandra.apache.org/doc/latest/development/testing.html#performance-testing
> > > > > >
> > > > > > "Performance tests for Cassandra are a special breed of tests
> > > > > > that are not part of the usual patch contribution process. In
> > > > > > fact you can contribute tons of patches to Cassandra without
> > > > > > ever running performance tests. They are important however when
> > > > > > working on performance improvements, as such improvements must
> > > > > > be measurable."
> > > > > >
> > > > > > I think performance numbers aren't just important, but should
> > > > > > be a *requirement* to merge a performance patch.
> > > > > >
> > > > > > Thoughts?
> > > > > > Jon
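Jeff's point that "writing a new microbenchmark isn't all that onerous" can be made concrete. Cassandra's committed microbenchmarks use JMH; the sketch below is a deliberately simpler, stdlib-only harness (all class and method names are invented for illustration, and it has none of JMH's protections against JIT and dead-code effects) showing the kind of quick local numbers Jonathan asked contributors to post for a List/Set/Map serialization change:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical back-of-the-envelope benchmark, not code from the ticket.
// It compares serializing a boxed List<Integer> against a primitive int[]
// and verifies both paths produce identical bytes.
public class SerializationBench {

    // Baseline: iterate the boxed list, unboxing each element.
    static byte[] serializeBoxed(List<Integer> values) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 * values.size());
        buf.putInt(values.size());
        for (Integer v : values) {
            buf.putInt(v); // auto-unboxes every element
        }
        return buf.array();
    }

    // Candidate: read from a primitive array, no per-element unboxing.
    static byte[] serializePrimitive(int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 * values.length);
        buf.putInt(values.length);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    // Average wall-clock nanoseconds per call over `iterations` runs.
    static long timeNanos(Runnable r, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            r.run();
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        int n = 10_000;
        int[] primitive = new int[n];
        List<Integer> boxed = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            int v = ThreadLocalRandom.current().nextInt();
            primitive[i] = v;
            boxed.add(v);
        }
        // Crude warmup so the JIT sees both paths before measurement.
        timeNanos(() -> serializeBoxed(boxed), 2_000);
        timeNanos(() -> serializePrimitive(primitive), 2_000);
        System.out.println("boxed:     "
                + timeNanos(() -> serializeBoxed(boxed), 2_000) + " ns/op");
        System.out.println("primitive: "
                + timeNanos(() -> serializePrimitive(primitive), 2_000) + " ns/op");
    }
}
```

Numbers from a throwaway harness like this are noisy; for the convincing artifact a reviewer would want, a JMH benchmark plus a cassandra-stress run would be the better evidence. But even this much, pasted into the JIRA, answers "did you measure it at all?"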
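Ariel's aside that the patch "randomly boxes a boolean" is a small example of the class of change review is meant to catch. A minimal, hypothetical illustration (not the actual code from the ticket): declaring a field as `Boolean` instead of `boolean` routes every assignment through `Boolean.valueOf()` and every read through an unboxing call. For booleans the boxed values are cached, so the cost is indirection and unboxing rather than allocation, plus the new possibility of a `NullPointerException`:

```java
// Hypothetical illustration of accidental boolean autoboxing; not code
// from the patch discussed above.
public class BoxedFlag {
    // Boolean instead of boolean: now nullable, and every read unboxes.
    static Boolean enabled = false; // compiles to Boolean.valueOf(false)

    static boolean isEnabled() {
        return enabled; // auto-unboxes on every call; NPE if enabled == null
    }

    public static void main(String[] args) {
        // Autoboxing a boolean reuses the cached TRUE/FALSE instances,
        // so it never allocates a new object:
        Boolean boxed = true;
        System.out.println(boxed == Boolean.TRUE); // prints "true"
        System.out.println(isEnabled());           // prints "false"
    }
}
```

Harmless in cold code, but in a serialization hot path the extra indirection is exactly the kind of thing a microbenchmark (or a careful reviewer) surfaces.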