On Wed, Nov 05, 2014 at 10:02:01PM +0000, Alex Gaynor wrote:
> Hey Toni,
>
> If this optimization is valid for any float, we should definitely do it,
> and this is a missed optimization. If it's not valid for all floats, I'm
> not sure how we should handle it, if at all.

I don't believe that it is valid for floats apart from exact powers of
two. Toni says:

> > only powers of two have an exact reciprocal floating point
> > representation, but there might be a benefit in trading the least
> > significant digit for a more significant speedup.

Please don't make that decision for the user. If I want to trade off
accuracy for speed, I can write:

    r = 1/x
    y*r

but if I write y/x, I expect y/x to the full accuracy available.

Thanks,

Steve

>
> Alex
>
> On Wed Nov 05 2014 at 10:16:36 AM Toni Mattis <
> toni.mat...@student.hpi.uni-potsdam.de> wrote:
>
> > Hello,
> >
> > I discovered that PyPy's JIT generates "DIVSD" instructions on xmm
> > registers when dividing a float by a constant C. This consumes an order
> > of magnitude more CPU cycles than the corresponding "MULSD" instruction
> > with a precomputed 1/C.
> >
> > I know that only powers of two have an exact reciprocal floating point
> > representation, but there might be a benefit in trading the least
> > significant digit for a more significant speedup.
> >
> > So, is this a missed optimization (at least for reasonably accurate
> > cases), a present or possibly future option (like -ffast-math in gcc) or
> > are there more reasons against it?
> >
> >
> > Thanks,
> >
> > Toni
> >
> >
> > --- PS: Small Example ---
> >
> > This function takes on average 0.41 seconds to compute on an
> > array.array('d') with 10**8 elements between 0 and 1:
> >
> > def spikes_div(data, threshold=1.99):
> >     count = 0
> >     for i in data:
> >         if i / 0.5 > threshold:
> >             count += 1
> >     return count
> >
> > Rewritten with a multiplication it takes about 0.29 seconds on average,
> > speeding it up by factor 1.4:
> >
> >     ...
> >         if i * 2.0 > threshold:
> >     ...
> >
> >
> > The traces contain the same instructions (except for the MULSD/DIVSD)
> > and run the same number of times. I'm working with a fresh translation
> > of the current PyPy default on Ubuntu 14.04 x64 with a 2nd generation
> > Core i7 CPU.
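
To make the accuracy point in this thread concrete, here is a minimal sketch (not part of the original messages) that counts how often y / C and y * (1 / C) disagree for random y in [0, 1). The helper name count_mismatches, the sample size, and the seed are assumptions chosen for illustration. It assumes IEEE 754 double precision, which is what CPython and PyPy use for float on common platforms.

```python
import random


def count_mismatches(divisor, samples=100_000, seed=0):
    """Count sampled y in [0, 1) for which y / divisor != y * (1 / divisor)."""
    rng = random.Random(seed)   # fixed seed so runs are repeatable
    reciprocal = 1.0 / divisor  # exact only when divisor is a power of two
    mismatches = 0
    for _ in range(samples):
        y = rng.random()
        # Division rounds once; reciprocal multiplication rounds twice
        # (once for 1/divisor, once for the product).
        if y / divisor != y * reciprocal:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    for divisor in (0.5, 3.0):
        print(divisor, count_mismatches(divisor))
```

For a power-of-two divisor such as 0.5 the count should be zero, since both forms only change the exponent. For a divisor such as 3.0 a noticeable share of inputs should differ in the last bit, which is the trade-off being debated here.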
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev