Re: committing performance patches without performance numbers

DuyHai Doan Thu, 09 Mar 2017 12:18:50 -0800

After looking at the patch, my thoughts (beware, it's getting very
technical):



Original code:

-            t = new ListType<T>(elements, isMultiCell);
-            ListType<T> t2 = internMap.putIfAbsent(elements, t);
-            t = (t2 == null) ? t : t2;


Optimized code:

+            t = internMap.computeIfAbsent(elements, K -> new
ListType<>(K, isMultiCell) );


This is a very obvious and *classical* optimization. In the original code,
we create an instance of ListType() but it is ONLY usefull if the interMap
doesn't have any value (putIfAbsent). Meaning that the creation of the
ListType object is useless if the map already contains a value.

The optimized code use the computeIfAbsent with a lambda (K -> new
ListType...) which will be evaluated iff the map doesn't contains the key

 For a high performance project like Cassandra, avoiding to create
un-necessary objects to help keeping the heap size small is always a good
idea.

 And this kind of optimization using lambda for lazy evaluation is quite
common (see a PR here on my Achilles project:
https://github.com/doanduyhai/Achilles/commit/9b9c09aa26d0edb8c58222b1ad069327b6f40da3
)

My guess is that the perf improvement and the gain is so obvious for Java
people that the JIRA creator thought a perf benchmark is not necessary

On Thu, Mar 9, 2017 at 9:00 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> I'd like to discuss what I consider to be a pretty important matter -
> patches which are written for the sole purpose of improving performance
> without including a single performance benchmark in the JIRA.
>
> My original email was in "Testing and Jira Tickets", i'll copy it here
> for posterity:
>
> If you don't mind, I'd like to broaden the discussion a little bit to also
> discuss performance related patches.  For instance, CASSANDRA-13271 was a
> performance / optimization related patch that included *zero* information
> on if there was any perf improvement or a regression as a result of the
> change, even though I've asked twice for that information.
>
> In addition to "does this thing break anything" we should be asking "how
> does this patch affect performance?" (and were the appropriate docs
> included, but that's another topic altogether)
>
> There's a minor note about perf related stuff here:
> http://cassandra.apache.org/doc/latest/development/
> testing.html#performance-testing
>
>
> "Performance tests for Cassandra are a special breed of tests that are not
> part of the usual patch contribution process. In fact you can contribute
> tons of patches to Cassandra without ever running performance tests. They
> are important however when working on performance improvements, as such
> improvements must be measurable."
>
> I think performance numbers aren't just important, but should be a
> *requirement* to merge a performance patch.
>
> Thoughts?
> Jon
>

Re: committing performance patches without performance numbers

Reply via email to