To me, the elephant in the room is what Ariel said:
>
> What about all the commits that don't intend to have a performance impact
> but do?


On top of that, it's all well and good to microbenchmark a change in a
vacuum, but what happens when the resource-usage implications of that
change have side effects on the rest of the system?

I've always been a vocal proponent of proving your improvement with numbers
(be they from microbenchmarks, single-node stress runs, etc.), so I'm
strongly +1 on us improving our hygiene on this front.
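
Even a single-node before/after comparison goes a long way. Something as
simple as (an illustrative cassandra-stress invocation; n and the thread
count are arbitrary here):

    cassandra-stress write n=1000000 -rate threads=50    # on trunk
    cassandra-stress write n=1000000 -rate threads=50    # on the patch

with the resulting op rate / latency summaries pasted into the ticket would
settle most of these arguments before they start.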

On Thu, Mar 9, 2017 at 4:56 PM, Ariel Weisberg <ar...@weisberg.ws> wrote:

> Hi,
>
> I should clarify. "Should" in the sense that it was fine for the process we
> had at the time (ugh!), but it's not what we should do in the future.
>
> Ariel
>
> On Thu, Mar 9, 2017, at 04:55 PM, Ariel Weisberg wrote:
> > Hi,
> >
> > I agree that patch could have used it, and it was amenable to
> > micro-benchmarking. Just being pedantic about process, which is something
> > I approve of, to the chagrin of so many.
> >
> > On a completely related note, that change also randomly boxes a boolean.
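> >
> > (Roughly the pattern in question; an illustrative snippet, not the actual
> > diff:
> >
> >     boolean result = computeFlag();   // computeFlag() is hypothetical
> >     Boolean boxed = result;           // auto-boxes via Boolean.valueOf
> >
> > where the primitive would have done fine.)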
> >
> > Ariel
> >
> > On Thu, Mar 9, 2017, at 03:45 PM, Jonathan Haddad wrote:
> > > I don't expect everyone to run a 500 node cluster off to the side to
> > > test their patches, but at least some indication that the contributor
> > > started Cassandra on their laptop would be a good sign.  The JIRA I
> > > referenced was an optimization around List, Set and Map serialization.
> > > Would it really have been that crazy to run a handful of benchmarks
> > > locally and post those results?
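> > >
> > > (To make that concrete: a handful of lines of JMH is all it takes. A
> > > minimal sketch, assuming JMH is on the classpath; the class name and
> > > the hand-rolled serialization are illustrative stand-ins, not the code
> > > from the ticket:
> > >
> > >     import java.nio.ByteBuffer;
> > >     import java.util.ArrayList;
> > >     import java.util.List;
> > >     import java.util.concurrent.TimeUnit;
> > >     import org.openjdk.jmh.annotations.*;
> > >
> > >     @State(Scope.Benchmark)
> > >     @BenchmarkMode(Mode.AverageTime)
> > >     @OutputTimeUnit(TimeUnit.NANOSECONDS)
> > >     public class ListSerializationBench
> > >     {
> > >         private List<Integer> list;
> > >
> > >         @Setup
> > >         public void setup()
> > >         {
> > >             list = new ArrayList<>();
> > >             for (int i = 0; i < 1024; i++)
> > >                 list.add(i);
> > >         }
> > >
> > >         @Benchmark
> > >         public ByteBuffer serialize()
> > >         {
> > >             // stand-in for the serializer under test
> > >             ByteBuffer buf = ByteBuffer.allocate(4 + 4 * list.size());
> > >             buf.putInt(list.size());
> > >             for (int v : list)
> > >                 buf.putInt(v);
> > >             buf.flip();
> > >             return buf;
> > >         }
> > >     }
> > >
> > > Run it once on trunk and once on the patch, paste both numbers into
> > > the ticket, and the question answers itself.)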
> > >
> > > On Thu, Mar 9, 2017 at 12:26 PM Ariel Weisberg <ar...@weisberg.ws> wrote:
> > >
> > > > Hi,
> > > >
> > > > I think there are issues around the availability of hardware
> > > > sufficient to demonstrate the performance concerns under test. It's
> > > > an open source project without centralized infrastructure. A lot of
> > > > performance contributions come from people running C* in production.
> > > > They are already running these changes and have seen the improvement,
> > > > but communicating that to the outside world in a convincing way can
> > > > be hard.
> > > >
> > > > Right now we don't even have performance in continuous integration.
> > > > I think we are putting the cart before the horse in that respect.
> > > > What about all the commits that don't intend to have a performance
> > > > impact but do? Even if we had performance metrics in CI, who is
> > > > going to triage the results religiously?
> > > >
> > > > We also, to my knowledge, don't have benchmarks for key
> > > > functionality in cassandra-stress. Can cassandra-stress benchmark
> > > > CAS? My recollection from every time I looked is that it wasn't
> > > > there.
> > > >
> > > > We can only set the bar as high as contributors are able to meet.
> > > > Certainly, if they can't justify why the thing they want to
> > > > contribute can't be benchmarked, then reviewers should make them go
> > > > and benchmark it.
> > > >
> > > > Regards,
> > > > Ariel
> > > >
> > > > On Thu, Mar 9, 2017, at 03:11 PM, Jeff Jirsa wrote:
> > > > > Agree. Anything that's meant to increase performance should
> > > > > demonstrate that it actually does so. We have microbench available
> > > > > in recent versions - writing a new microbenchmark isn't all that
> > > > > onerous. It would be great if we had perf tests included in the
> > > > > normal testall/dtest workflow for ALL patches so we could quickly
> > > > > spot regressions, but that gets pretty expensive in terms of
> > > > > running tests long enough to actually exercise the most common
> > > > > code paths.
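> > > > >
> > > > > (If memory serves, the harness lives under test/microbench in the
> > > > > tree and is driven by JMH; "ant microbench" runs it, though the
> > > > > exact invocation may vary by version.)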
> > > > >
> > > > >
> > > > > On Thu, Mar 9, 2017 at 12:00 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:
> > > > >
> > > > > > I'd like to discuss what I consider to be a pretty important
> > > > > > matter - patches which are written for the sole purpose of
> > > > > > improving performance without including a single performance
> > > > > > benchmark in the JIRA.
> > > > > >
> > > > > > My original email was in "Testing and Jira Tickets"; I'll copy it
> > > > > > here for posterity:
> > > > > >
> > > > > > If you don't mind, I'd like to broaden the discussion a little
> > > > > > bit to also discuss performance related patches.  For instance,
> > > > > > CASSANDRA-13271 was a performance / optimization related patch
> > > > > > that included *zero* information on whether there was any perf
> > > > > > improvement or a regression as a result of the change, even
> > > > > > though I've asked twice for that information.
> > > > > >
> > > > > > In addition to "does this thing break anything?" we should be
> > > > > > asking "how does this patch affect performance?" (and "were the
> > > > > > appropriate docs included?", but that's another topic
> > > > > > altogether).
> > > > > >
> > > > > > There's a minor note about perf related stuff here:
> > > > > > http://cassandra.apache.org/doc/latest/development/testing.html#performance-testing
> > > > > >
> > > > > >
> > > > > > "Performance tests for Cassandra are a special breed of tests
> that are
> > > > not
> > > > > > part of the usual patch contribution process. In fact you can
> > > > contribute
> > > > > > tons of patches to Cassandra without ever running performance
> tests.
> > > > They
> > > > > > are important however when working on performance improvements,
> as such
> > > > > > improvements must be measurable."
> > > > > >
> > > > > > I think performance numbers aren't just important; they should
> > > > > > be a *requirement* for merging a performance patch.
> > > > > >
> > > > > > Thoughts?
> > > > > > Jon
> > > > > >
> > > >
>
