Re: [Cython] some advice on this module regarding performance?

2009-09-30 Thread Chris Colbert
unfortunately in that statement, the (e2/e1) power is applied to the sum of inner terms, so I can't condense it any further than this: (fabs(f1)**(2/e2) + fabs(f2)**(2/e2))**(e2/e1) + fabs(f3)**(2/e1) On Wed, Sep 30, 2009 at 1:52 AM, Robert Bradshaw rober...@math.washington.edu wrote: On Sep

Re: [Cython] some advice on this module regarding performance?

2009-09-30 Thread Robert Bradshaw
On Sep 30, 2009, at 1:12 AM, Chris Colbert wrote: unfortunately in that statement, the (e2/e1) power is applied to the sum of inner terms, so I can't condense it any further than this: (fabs(f1)**(2/e2) + fabs(f2)**(2/e2))**(e2/e1) + fabs(f3)**(2/e1) Ah, I was mismatching parentheses.
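
The condensed expression can be checked against the form quoted elsewhere in the thread. The following is only a sketch in plain Python, not the Cython code under discussion; the function names and sample values are made up for illustration.

```python
import math

def f_original(f1, f2, f3, e1, e2):
    # Form quoted in the thread: squaring first forces a non-negative base.
    return ((f1**2)**(1/e2) + (f2**2)**(1/e2))**(e2/e1) + (f3**2)**(1/e1)

def f_condensed(f1, f2, f3, e1, e2):
    # Condensed form: fabs() replaces the squaring, and the exponent 2/e
    # folds the square into the fractional power.
    return (math.fabs(f1)**(2/e2) + math.fabs(f2)**(2/e2))**(e2/e1) \
        + math.fabs(f3)**(2/e1)

# Hypothetical sample values, including negative inputs.
args = (-0.3, 0.7, -1.2, 0.5, 0.8)
print(math.isclose(f_original(*args), f_condensed(*args), rel_tol=1e-9))
```

The condensed form makes three fewer pow calls per evaluation, since each (f**2)**(1/e) pair collapses into a single power of fabs(f).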

[Cython] some advice on this module regarding performance?

2009-09-29 Thread Chris Colbert
Normally, when I'm hitting a roadblock with numpy performance, I port that section to cython and typically see an order of magnitude increase in speed. (This is without disabling bounds checking.) In my current case, I'm seeing a slowdown of 2x (with bounds checking disabled) and I'm pretty sure

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Dag Sverre Seljebotn
Chris Colbert wrote: Normally, when I'm hitting a roadblock with numpy performance, I port that section to cython and typically see an order of magnitude increase in speed. (This is without disabling bounds checking.) In my current case, I'm seeing a slowdown of 2x (with bounds checking

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Chris Colbert
Dag, HTML was too big to attach to the list email, so I sent it to your personal email. Thanks for taking a look! Chris On Tue, Sep 29, 2009 at 4:42 PM, Chris Colbert sccolb...@gmail.com wrote: Dag, HTML is attached, all that garbage at the bottom is commented out via a docstring (i was

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Chris Colbert
I just verified that the loop is consuming 96% of the execution time... weird... the loop gets converted to very basic C code... I would think it would be much faster... perhaps it's my compilation step? my setup.py looks like this: # setup.py # from distutils.core import setup

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Sturla Molden
Chris Colbert skrev: and within that loop it is these statements that take the bulk of the time: F = ((f1**2)**(1/e2) + (f2**2)**(1/e2))**(e2/e1) + (f3**2)**(1/e1) temperr = (C4 * (F**(e1) - 1))**2 and replacing the powers with serial multiplications doesn't really help any... Does this

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Sturla Molden
Sturla Molden skrev: Chris Colbert skrev: and within that loop it is these statements that take the bulk of the time: F = ((f1**2)**(1/e2) + (f2**2)**(1/e2))**(e2/e1) + (f3**2)**(1/e1) temperr = (C4 * (F**(e1) - 1))**2 and replacing the powers with serial multiplications doesn't really

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Chris Colbert
or better yet, which pow function is numpy using? On Tue, Sep 29, 2009 at 6:21 PM, Chris Colbert sccolb...@gmail.com wrote: No, the python ** gets translated to a pow statement by cython. I think the issue is that for some reason, i'm getting stuck in the gcc slow pow function. If i let

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Chris Colbert
No, the python ** gets translated to a pow statement by cython. I think the issue is that for some reason, i'm getting stuck in the gcc slow pow function. If i let e2 and e1 be 1 and replace f**2 (which would call pow) with f*f, my execution time drops to this: 1 loops, best of 3: 108 µs

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Chris Colbert
tried that already... no difference... On Tue, Sep 29, 2009 at 6:33 PM, Robert Kern robert.k...@gmail.com wrote: On 2009-09-29 11:21 AM, Chris Colbert wrote: or better yet, which pow function is numpy using? double pow(double, double) from math.h . -- Robert Kern I have come to

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Sturla Molden
Chris Colbert skrev: No, the python ** gets translated to a pow statement by cython. I think the issue is that for some reason, i'm getting stuck in the gcc slow pow function. If i let e2 and e1 be 1 and replace f**2 (which would call pow) with f*f, my execution time drops to this:

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Chris Colbert
my big issue here is that these two lines of code are taking more time to execute than the entire function as a pure numpy implementation. And numpy is using the same calls to pow in the background... i just don't get it... but i will figure it out... On Tue, Sep 29, 2009 at 7:05 PM, Sturla

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Chris Colbert
ok, so after playing around with the values of e1 and e2 in these expressions: F = ((f1**2)**(1/e2) + (f2**2)**(1/e2))**(e2/e1) + (f3**2)**(1/e1) temperr = (C4 * (F**(e1) - 1))**2 the cython code executes just as fast as numpy. I guess I can't get any faster because the bottleneck is the calls to

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Sturla Molden
Chris Colbert skrev: so moral of the story, if you have a numpy script that makes repeated use of pow you probably won't speed it up with cython, because pow is a huge bottleneck. I'd go further than this: NumPy is written in C and you should not expect replacing NumPy's C with Cython's

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Dag Sverre Seljebotn
Sturla Molden wrote: Chris Colbert skrev: so moral of the story, if you have a numpy script that makes repeated use of pow you probably won't speed it up with cython, because pow is a huge bottleneck. I'd go further than this: NumPy is written in C and you should not expect replacing

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Robert Bradshaw
On Sep 29, 2009, at 10:05 AM, Sturla Molden wrote: Chris Colbert skrev: No, the python ** gets translated to a pow statement by cython. I think the issue is that for some reason, i'm getting stuck in the gcc slow pow function. If i let e2 and e1 be 1 and replace f**2 (which would call

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Chris Colbert
Actually, yes. The exponents are done specifically in that way to force the sign of the end value. I suppose i could just call abs() on it though - Chris On Tue, Sep 29, 2009 at 8:58 PM, Robert Bradshaw rober...@math.washington.edu wrote: On Sep 29, 2009, at 10:05 AM, Sturla Molden

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Robert Kern
On 2009-09-29 13:19 PM, Dag Sverre Seljebotn wrote: Sturla Molden wrote: - The third is NumPy not releasing the GIL, and restricting multithreaded code to one CPU. Out of curiosity, does it happen often that NumPy does not release the GIL? Because of necessity or because it's just not been

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Sturla Molden
Dag Sverre Seljebotn skrev: Out of curiosity, does it happen often that NumPy does not release the GIL? As Robert Kern said, ufuncs release the GIL. The rest of NumPy does not (including FFTs and linear algebra). SciPy more or less never releases the GIL from C or Fortran. Because of

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Sturla Molden
Sturla Molden skrev: SciPy more or less never releases the GIL from C or Fortran. Which is also both from necessity and not bothering. Necessity e.g. because SciPy's FFTPACK is called in a way that prevents it from being re-entrant. It is sad that the GIL is not released pervasively in NumPy

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Robert Kern
On 2009-09-29 17:05 PM, Sturla Molden wrote: Sturla Molden skrev: SciPy more or less never releases the GIL from C or Fortran. Which is also both from necessity and not bothering. Necessity e.g. because SciPy's FFTPACK is called in a way that prevents it from being re-entrant. It is sad

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Greg Ewing
Chris Colbert wrote: and within that loop it is these statements that take the bulk of the time: F = ((f1**2)**(1/e2) + (f2**2)**(1/e2))**(e2/e1) + (f3**2)**(1/e1) temperr = (C4 * (F**(e1) - 1))**2 ^^ This variable doesn't appear anywhere in the quoted Cython code. Did you

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Greg Ewing
Chris Colbert wrote: over 6x improvement just by avoiding a few measly pow statements... anyone know why i'm stuck in slowpow? Using pow to calculate x**2 is always going to be slower than x*x, no matter how smart the pow function is. Function calls have some overhead, even in C. -- Greg
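
A rough way to see the cost of a call versus a multiplication is to time both. This sketch uses plain Python and timeit, so it mostly measures interpreter overhead rather than the C-level pow discussed in the thread; the sample value and iteration count are arbitrary assumptions, and timings will vary by machine.

```python
import math
import timeit

x = 1.2345  # hypothetical sample value

# Time a plain multiplication against a call to math.pow (C's pow underneath).
t_mul = timeit.timeit("x * x", globals={"x": x}, number=200_000)
t_pow = timeit.timeit("pow(x, 2.0)", globals={"x": x, "pow": math.pow},
                      number=200_000)

print(f"x * x:            {t_mul:.4f} s")
print(f"math.pow(x, 2.0): {t_pow:.4f} s")
```

In Cython the same comparison would be made on typed C doubles, where x*x compiles to a single multiply and pow(x, 2.0) remains a library call unless the compiler special-cases it.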

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Greg Ewing
Chris Colbert wrote: my big issue here is that these two lines of code, are taking more time to execute than the entire function as a pure numpy implementation. And numpy is using the same calls to pow in the background... Are you sure about that? Or is it noticing that you're raising

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Robert Bradshaw
On Sep 29, 2009, at 12:03 PM, Chris Colbert wrote: Actually, yes. The exponents are done specifically in that way to force the sign of the end value. I suppose i could just call abs() on it though I'm sure that abs would be faster than pow, though it's not the squaring that's

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Robert Bradshaw
On Sep 29, 2009, at 4:36 PM, Greg Ewing wrote: Chris Colbert wrote: over 6x improvement just by avoiding a few measly pow statements... anyone know why i'm stuck in slowpow? Using pow to calculate x**2 is always going to be slower than x*x, no matter how smart the pow function is. Function

Re: [Cython] some advice on this module regarding performance?

2009-09-29 Thread Sturla Molden
Robert Kern skrev: We eagerly await your patches. I have provided one for NumPy's FFT, lfilter and ckdtree. Nothing gets into svn, so I don't see why I should bother. ___ Cython-dev mailing list Cython-dev@codespeak.net