Re: [Numpy-discussion] fast_any_all , a trivial but fast/useful helper function for numpy

Graeme B. Bell Thu, 05 Sep 2013 01:48:07 -0700

Hi Robert, 

Thanks for proposing an alternative implementation approach. 
However, did you test your proposal before you made the assertion about its 
behaviour?



>reduce(np.logical_or, inputs, False)
>reduce(np.logical_and, inputs, True)

This code consistently benchmarks 20% slower than the method I use (tested on 
two different machines several times).
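For anyone who wants to repeat this kind of comparison, here is a minimal timing sketch. It wraps the two quoted reduce() idioms in functions and times them against one in-place alternative. Note that `any_inplace` and `best_time` are hypothetical names made up for this sketch; `any_inplace` is not the fast_any_all implementation, and the array sizes are arbitrary assumptions. Numbers will differ between machines and numpy versions.

```python
from functools import reduce
import timeit

import numpy as np


def any_reduce(inputs):
    # The quoted idiom: fold logical_or over the inputs, starting from False.
    return reduce(np.logical_or, inputs, False)


def all_reduce(inputs):
    # The quoted idiom: fold logical_and over the inputs, starting from True.
    return reduce(np.logical_and, inputs, True)


def any_inplace(inputs):
    # Hypothetical alternative for comparison only (NOT fast_any_all):
    # accumulate into one boolean buffer instead of allocating a new
    # array at every step. Assumes at least one input array.
    out = np.array(inputs[0], dtype=bool, copy=True)
    for arr in inputs[1:]:
        np.logical_or(out, arr, out=out)
    return out


def best_time(fn, inputs, repeat=3, number=10):
    # Best-of-N wall-clock seconds per call.
    timings = timeit.repeat(lambda: fn(inputs), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumed workload: eight boolean arrays of one million elements each.
    inputs = [rng.random(1_000_000) > 0.5 for _ in range(8)]
    # Check the candidates agree before comparing their speed.
    assert np.array_equal(any_reduce(inputs), any_inplace(inputs))
    for fn in (any_reduce, any_inplace):
        print(f"{fn.__name__}: {best_time(fn, inputs):.6f} s per call")
```

Checking that the candidates agree before timing them matters: a fast answer that is wrong is not worth benchmarking.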


>Your fast_logic() is basically reduce().

No, it isn't.


Updated benchmarks for your proposal, and for another alternative 
implementation using boolean indexing, are at: 
https://github.com/gbb/numpy-fast-any-all/blob/master/BENCHMARK.md 


Three general points arising from this:

1 - idioms don't have test coverage

Generally, by using idioms rather than functions, you risk mistyping or 
misusing the form of the idiom and thus introducing a bug. You also lose out on 
the explicit testing and the implicit 'real world testing' that tend to build 
up around library functions.
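As a small illustration of the mistyping risk, consider the reduce() idiom with the wrong initial value. The sketch below is only an example: `all_of`, `all_of_mistyped` and `check` are hypothetical names, not part of numpy or of fast_any_all. The mistyped version runs without error and silently returns all False, which is exactly the sort of bug a tested function would catch once rather than at every place the idiom is pasted.

```python
from functools import reduce

import numpy as np


def all_of(inputs):
    # Elementwise AND across a sequence of arrays; True is the identity.
    return reduce(np.logical_and, inputs, True)


def all_of_mistyped(inputs):
    # Deliberate bug: False is the wrong starting value for AND, so the
    # result is all False whatever the inputs contain.
    return reduce(np.logical_and, inputs, False)


def check(fn):
    # A tiny test of the kind a library function would carry with it.
    a = np.array([True, True, False])
    b = np.array([True, False, False])
    expected = np.array([True, False, False])
    return bool(np.array_equal(fn([a, b]), expected))


if __name__ == "__main__":
    print("all_of passes:", check(all_of))
    print("all_of_mistyped passes:", check(all_of_mistyped))
```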


2 - idioms aren't maintained or updated (and they have an unknown shelf life)

An idiom might be fast today (or not) and it may be correct today, but tomorrow 
is unknown. 

A key problem is that the relative performance of the parts of a library like 
numpy will keep changing - sometimes substantially - and idiomatic approaches 
to overcome performance difficulties in the short term tend to become outdated 
and even harmful very quickly. As in this example, they can even be harmful 
from the moment they're written. Browsing a site like Stack Overflow will show 
you both new and experienced users taking inefficient approaches because of 
outdated idiomatic advice. 


3 - idioms are OK, but functions are better, because implementation hiding and 
abstraction are good things. 

If you use a benchmarked/tested function which acknowledges a range of 
alternative implementations, you have a reasonable degree of confidence that 
you're getting the best performance and correct behaviour, because you can 
actually see the effects of the alternative implementations in benchmarks/test 
output. 

It's a lot more sensible to use a function from a publicly available library - 
any library - than to manually maintain a set of idioms and have to continually 
search your software for the idioms, benchmark them to see if they're still 
beneficial, and modify them when they're not. 
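To make the implementation-hiding point concrete, here is a rough sketch of a single public function sitting in front of several interchangeable implementations. All names here (`any_of`, `_any_fold`, `_any_stack`, `implementations_agree`) are hypothetical and chosen for illustration; this is not how fast_any_all is organised. Callers only ever use `any_of`, so if benchmarks later favour a different implementation, only the default changes.

```python
from functools import reduce

import numpy as np


def _any_fold(inputs):
    # Fold logical_or pairwise over the inputs.
    return reduce(np.logical_or, inputs, False)


def _any_stack(inputs):
    # Stack the inputs along a new first axis and reduce over it.
    # Assumes all inputs have the same shape.
    return np.any(np.stack(inputs), axis=0)


# The alternatives the function acknowledges; the default can be switched
# here when benchmarks change, without touching any calling code.
_IMPLEMENTATIONS = {"fold": _any_fold, "stack": _any_stack}
_DEFAULT = "fold"


def any_of(inputs, method=None):
    # Public entry point: elementwise OR across a sequence of arrays.
    return _IMPLEMENTATIONS[method or _DEFAULT](inputs)


def implementations_agree(inputs):
    # Cross-check every registered implementation against the first one.
    results = [impl(inputs) for impl in _IMPLEMENTATIONS.values()]
    return all(np.array_equal(results[0], r) for r in results[1:])
```

The cross-check is what lets you see the effect of the alternatives in test output, rather than trusting that an idiom copied into your code is still the right one.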

Graeme


_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
