unfortunately in that statement, the (e2/e1) power is applied to the
sum of inner terms, so I can't condense it any further than this:
(fabs(f1)**(2/e2) + fabs(f2)**(2/e2))**(e2/e1) + fabs(f3)**(2/e1)
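A quick plain-Python check (with arbitrary sample values for f1, f2, f3, e1, e2 -- these are not values from the original code) confirms the fabs form is algebraically the same as the squared form; it just saves one pow call per term:

```python
import math

def F_original(f1, f2, f3, e1, e2):
    # Squaring before the fractional power forces a non-negative base.
    return ((f1**2)**(1/e2) + (f2**2)**(1/e2))**(e2/e1) + (f3**2)**(1/e1)

def F_condensed(f1, f2, f3, e1, e2):
    # abs(f)**(2/e) equals (f**2)**(1/e) for real f, with one fewer pow per term.
    return (abs(f1)**(2/e2) + abs(f2)**(2/e2))**(e2/e1) + abs(f3)**(2/e1)

# Hypothetical sample values, including negatives to exercise the sign handling.
a = F_original(-1.5, 2.0, -0.3, 0.7, 1.3)
b = F_condensed(-1.5, 2.0, -0.3, 0.7, 1.3)
assert math.isclose(a, b)
```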
On Wed, Sep 30, 2009 at 1:52 AM, Robert Bradshaw
rober...@math.washington.edu wrote:
On Sep 30, 2009, at 1:12 AM, Chris Colbert wrote:
unfortunately in that statement, the (e2/e1) power is applied to the
sum of inner terms, so I can't condense it any further than this:
(fabs(f1)**(2/e2) + fabs(f2)**(2/e2))**(e2/e1) + fabs(f3)**(2/e1)
Ah, I was mismatching parentheses.
Normally, when I'm hitting a roadblock with numpy performance, I port
that section to Cython and typically see an order-of-magnitude
increase in speed (this is without disabling bounds checking). In my
current case, I'm seeing a slowdown of 2x (with bounds checking
disabled) and I'm pretty sure
Dag,
HTML was too big to attach to the list email, so I sent it to your
personal email.
Thanks for taking a look!
Chris
On Tue, Sep 29, 2009 at 4:42 PM, Chris Colbert sccolb...@gmail.com wrote:
Dag,
HTML is attached; all that garbage at the bottom is commented out via
a docstring (I was
I just verified that the loop is consuming 96% of the execution time...
weird... the loop gets converted to very basic C code... I would think
it would be much faster...
perhaps it's my compilation step?
my setup.py looks like this:
# setup.py #
from distutils.core import setup
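The archive cut the rest of the setup.py off. A typical distutils-based Cython build script from that era looks roughly like this -- the extension name fast_loop and the .pyx filename are placeholders, and whatever compiler flags were actually used are unknown (which matters, since the compilation step is the suspect here):

```python
# Sketch of a 2009-era Cython build script; "fast_loop" is a placeholder name.
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

setup(
    cmdclass={'build_ext': build_ext},
    ext_modules=[Extension("fast_loop", ["fast_loop.pyx"])],
)
```

Built with `python setup.py build_ext --inplace`.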
Chris Colbert wrote:
and within that loop it is these statements that take the bulk of the time:
F = ((f1**2)**(1/e2) + (f2**2)**(1/e2))**(e2/e1) + (f3**2)**(1/e1)
temperr = (C4 * (F**(e1) - 1))**2
and replacing the powers with serial multiplications doesn't really help any...
Does this
or better yet, which pow function is numpy using?
On Tue, Sep 29, 2009 at 6:21 PM, Chris Colbert sccolb...@gmail.com wrote:
No, the Python ** gets translated to a pow call by Cython.
I think the issue is that, for some reason, I'm getting stuck in the
slow gcc pow function.
If I let e2 and e1 be 1 and replace f**2 (which would call pow) with f*f,
my execution time drops to this:
1 loops, best of 3: 108 µs
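That specialization is easy to sanity-check in plain Python (sample values here are arbitrary, not from the original code): with e1 == e2 == 1 every fractional power in F collapses, so plain multiplications can replace all the pow calls.

```python
def F_general(f1, f2, f3, e1, e2):
    # The full expression, with four pow calls.
    return ((f1**2)**(1/e2) + (f2**2)**(1/e2))**(e2/e1) + (f3**2)**(1/e1)

def F_unit_exponents(f1, f2, f3):
    # With e1 == e2 == 1 every fractional power collapses,
    # leaving a simple sum of squares with no pow at all.
    return f1*f1 + f2*f2 + f3*f3

assert abs(F_general(1.5, -2.0, 0.25, 1.0, 1.0)
           - F_unit_exponents(1.5, -2.0, 0.25)) < 1e-12
```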
tried that already...
no difference...
On Tue, Sep 29, 2009 at 6:33 PM, Robert Kern robert.k...@gmail.com wrote:
On 2009-09-29 11:21 AM, Chris Colbert wrote:
or better yet, which pow function is numpy using?
double pow(double, double) from math.h.
--
Robert Kern
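For the curious, that same libm pow can be poked at directly from Python via ctypes -- a sketch assuming a Unix-like system where find_library("m") resolves:

```python
import ctypes
import ctypes.util
import math

# Load the C math library; on Linux/macOS find_library("m") locates libm.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.pow.restype = ctypes.c_double
libm.pow.argtypes = [ctypes.c_double, ctypes.c_double]

# C's pow(double, double) agrees with Python's ** for these inputs.
assert math.isclose(libm.pow(2.0, 10.0), 2.0 ** 10.0)
```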
my big issue here is that these two lines of code are taking more
time to execute than the entire function as a pure numpy
implementation. And numpy is using the same calls to pow in the
background...
I just don't get it... but I will figure it out...
On Tue, Sep 29, 2009 at 7:05 PM, Sturla
ok, so after playing around with the values of e1 and e2 in these expressions:
F = ((f1**2)**(1/e2) + (f2**2)**(1/e2))**(e2/e1) + (f3**2)**(1/e1)
temperr = (C4 * (F**(e1) - 1))**2
the cython code executes just as fast as numpy. I guess I can't get any
faster because the bottleneck is the calls to
Chris Colbert wrote:
so moral of the story: if you have a numpy script that makes repeated
use of pow, you probably won't speed it up with cython, because pow is a
huge bottleneck.
I'd go further than this:
NumPy is written in C and you should not expect replacing NumPy's C with
Cython's
Actually, yes. The exponents are done specifically in that way to
force the sign of the end value.
I suppose I could just call abs() on it, though.
- Chris
On Tue, Sep 29, 2009 at 8:58 PM, Robert Bradshaw
rober...@math.washington.edu wrote:
On 2009-09-29 13:19 PM, Dag Sverre Seljebotn wrote:
Sturla Molden wrote:
- The third is NumPy not releasing the GIL, and restricting
multithreaded code to one CPU.
Out of curiosity, does it happen often that NumPy does not release the
GIL? Because of necessity or because it's just not been
Dag Sverre Seljebotn wrote:
Out of curiosity, does it happen often that NumPy does not release the
GIL?
As Robert Kern said, ufuncs release the GIL. The rest of NumPy does not
(including FFTs and linear algebra).
SciPy more or less never releases the GIL from C or Fortran.
Sturla Molden wrote:
SciPy more or less never releases the GIL from C or Fortran.
Which is also both from necessity and not bothering. Necessity e.g.
because SciPy's FFTPACK is called in a way that prevents it from being
re-entrant.
It is sad that the GIL is not released pervasively in NumPy
Chris Colbert wrote:
and within that loop it is these statements that take the bulk of the time:
F = ((f1**2)**(1/e2) + (f2**2)**(1/e2))**(e2/e1) + (f3**2)**(1/e1)
temperr = (C4 * (F**(e1) - 1))**2
^^
This variable doesn't appear anywhere in the quoted
Cython code. Did you
Chris Colbert wrote:
over 6x improvement just by avoiding a few measly pow statements...
anyone know why I'm stuck in slowpow?
Using pow to calculate x**2 is always going to be slower
than x*x, no matter how smart the pow function is. Function
calls have some overhead, even in C.
--
Greg
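Greg's point about call overhead can be sketched in plain Python with timeit (the value of x is arbitrary; no particular timing ratio is asserted, since that varies by machine):

```python
import math
import timeit

x = 1.2345

# x*x and pow(x, 2.0) agree to rounding; the difference is call overhead.
assert math.isclose(x * x, math.pow(x, 2.0))

# Timings vary by machine, but x*x avoids a function call entirely.
t_mul = timeit.timeit("x * x", globals={"x": x}, number=1_000_000)
t_pow = timeit.timeit("math.pow(x, 2.0)",
                      globals={"x": x, "math": math}, number=1_000_000)
print(f"x*x: {t_mul:.3f}s  math.pow(x, 2.0): {t_pow:.3f}s")
```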
Chris Colbert wrote:
my big issue here is that these two lines of code are taking more
time to execute than the entire function as a pure numpy
implementation. And numpy is using the same calls to pow in the
background...
Are you sure about that? Or is it noticing that you're
raising
On Sep 29, 2009, at 12:03 PM, Chris Colbert wrote:
Actually, yes. The exponents are done specifically in that way to
force the sign of the end value.
I suppose i could just call abs() on it though
I'm sure that abs would be faster than pow, though it's not the
squaring that's
Robert Kern wrote:
We eagerly await your patches.
I have provided one for NumPy's FFT, lfilter and ckdtree. Nothing gets
into svn, so I don't see why I should bother.
___
Cython-dev mailing list
Cython-dev@codespeak.net