I don't expect everyone to run a 500-node cluster off to the side to test
their patches, but at least some indication that the contributor started
Cassandra on their laptop would be a good sign.  The JIRA I referenced was
an optimization around List, Set and Map serialization.  Would it really
have been that crazy to run a handful of benchmarks locally and post those
results?
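
For what it's worth, I'm not asking for anything elaborate. A minimal
JMH-style microbenchmark along the lines of the sketch below, run locally
with before/after numbers pasted into the ticket, would be plenty. The
class name and setup here are made up for illustration, and the loop is
just a self-contained stand-in for whatever serializer call is actually
under test:

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.TimeUnit;

    import org.openjdk.jmh.annotations.*;

    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @State(Scope.Thread)
    @Warmup(iterations = 5, time = 1)
    @Measurement(iterations = 5, time = 1)
    @Fork(1)
    public class ListSerializationBench // hypothetical name, not an existing class
    {
        private List<ByteBuffer> values;

        @Setup
        public void setup()
        {
            // Build a list of 100 small ByteBuffers to feed the code under test.
            values = new ArrayList<>();
            for (int i = 0; i < 100; i++)
                values.add(ByteBuffer.allocate(8).putLong(0, i));
        }

        @Benchmark
        public int serializeList()
        {
            // Stand-in for the real call under test (e.g. the List/Set/Map
            // serializer touched by the patch); here we only sum remaining
            // bytes so the sketch compiles on its own.
            int size = 0;
            for (ByteBuffer bb : values)
                size += bb.remaining();
            return size;
        }
    }

Run that against the base branch and the patched branch, paste the two
sets of numbers into the JIRA, and the reviewer has something concrete to
look at.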

On Thu, Mar 9, 2017 at 12:26 PM Ariel Weisberg <ar...@weisberg.ws> wrote:

> Hi,
>
> I think there are issues around the availability of hardware sufficient
> to demonstrate the performance concerns under test. It's an open source
> project without centralized infrastructure. A lot of performance
> contributions come from people running C* in production. They are
> already running these changes and have seen the improvement, but
> communicating that to the outside world in a convincing way can be hard.
>
> Right now we don't even have performance testing in continuous
> integration. I think we are putting the cart before the horse in that
> respect. What about all the commits that don't intend to have a
> performance impact but do? Even if we had performance metrics in CI, who
> is going to triage the results religiously?
>
> We also, to my knowledge, don't have benchmarks for key functionality in
> cassandra-stress. Can cassandra-stress benchmark CAS? My recollection,
> from every time I looked, is that it wasn't there.
>
> We can only set the bar as high as contributors are able to meet.
> Certainly, if a contributor can't justify why they can't benchmark the
> thing they want to contribute, then reviewers should make them go and
> benchmark it.
>
> Regards,
> Ariel
>
> On Thu, Mar 9, 2017, at 03:11 PM, Jeff Jirsa wrote:
> > Agree. Anything that's meant to increase performance should demonstrate
> > it actually does that. We have microbench available in recent versions -
> > writing a new microbenchmark isn't all that onerous. It would be great
> > if we had perf tests included in the normal testall/dtest workflow for
> > ALL patches so we could quickly spot regressions, but that gets pretty
> > expensive in terms of running tests long enough to actually exercise
> > most common code paths.
> >
> >
> > On Thu, Mar 9, 2017 at 12:00 PM, Jonathan Haddad <j...@jonhaddad.com>
> > wrote:
> >
> > > I'd like to discuss what I consider to be a pretty important matter -
> > > patches which are written for the sole purpose of improving performance
> > > without including a single performance benchmark in the JIRA.
> > >
> > > My original email was in "Testing and Jira Tickets"; I'll copy it
> > > here for posterity:
> > >
> > > If you don't mind, I'd like to broaden the discussion a little bit to
> > > also discuss performance-related patches.  For instance,
> > > CASSANDRA-13271 was a performance / optimization related patch that
> > > included *zero* information on whether there was any perf improvement
> > > or a regression as a result of the change, even though I've asked
> > > twice for that information.
> > >
> > > In addition to "does this thing break anything" we should be asking
> > > "how does this patch affect performance?" (and were the appropriate
> > > docs included, but that's another topic altogether)
> > >
> > > There's a minor note about perf related stuff here:
> > > http://cassandra.apache.org/doc/latest/development/testing.html#performance-testing
> > >
> > >
> > > "Performance tests for Cassandra are a special breed of tests that are
> not
> > > part of the usual patch contribution process. In fact you can
> contribute
> > > tons of patches to Cassandra without ever running performance tests.
> They
> > > are important however when working on performance improvements, as such
> > > improvements must be measurable."
> > >
> > > I think performance numbers aren't just important, but should be a
> > > *requirement* to merge a performance patch.
> > >
> > > Thoughts?
> > > Jon
> > >
>
