OK, I finally ran some simple benchmarks. To summarize, it looks like  
the non-naive algorithm is 4 times slower on my computer for doubles  
(see the code I used below).

-------- complex.pyx ----------

def time_div(double complex z, int N):
     cdef double complex a = 1
     cdef int i
     for i from 0 <= i < N:
         a = a/z
     return a

------ C code (toggled manually) --------

     static INLINE %(type)s %(type_name)s_div_fast(%(type)s a, %(type) 
s b) {
         %(type)s z;
         %(real_type)s denom = b.real*b.real + b.imag*b.imag;
         z.real = (a.real * b.real + a.imag * b.imag) / denom;
         z.imag = (a.imag * b.real - a.real * b.imag) / denom;
         return z;
     }

     static INLINE %(type)s %(type_name)s_div_slow(%(type)s a, %(type) 
s b) {
         %(type)s z;
         const %(real_type)s abs_breal = fabs(b.real);
         const %(real_type)s abs_bimag = fabs(b.imag);
         if (abs_breal >= abs_bimag) {
             const %(real_type)s ratio = b.imag / b.real;
             const %(real_type)s denom = b.real + b.imag * ratio;
             z.real = (a.real + a.imag * ratio) / denom;
             z.imag = (a.imag - a.real * ratio) / denom;
         }
         else {
             const %(real_type)s ratio = b.real / b.imag;
             const %(real_type)s denom = b.real * ratio + b.imag;
             z.real = (a.real * ratio + a.imag) / denom;
             z.imag = (a.imag * ratio - a.real) / denom;
         }
         return z;
     }

     #define %(type_name)s_div %(type_name)s_div_slow

------- session --------

In [1]: from complex import time_div; import math; a = math.e**1j
In [2]: time abs(time_div(a, 10**7))
CPU times: user 0.10 s, sys: 0.00 s, total: 0.10 s
Wall time: 0.10 s
Out[3]: 1.0000000002480893

vs.

In [1]: from complex import time_div; import math; a = math.e**1j
In [2]: time abs(time_div(a, 10**7))
CPU times: user 0.40 s, sys: 0.00 s, total: 0.40 s
Wall time: 0.40 s
Out[3]: 1.0000000000360101

------------------------------------

The difference in accumulated error after ten million divisions is:

sage: (1 - 1.0000000002480893) / (1 - 1.0000000000360101)
6.8894367097208654

 From before, relative errors for the naive vs. other algorithm,  
100000 runs, uniformly chosen in unit square (though nearly all  
distributions look basically the same):

naive
     better 26187
     avg 1.4940916064705992895601085724e-16
     worst 5.7659574333851360909896621025e-16
other
     better 63116
     avg 9.5414951065097745299683547276e-17
     worst 3.9584519821557590591785765975e-16

- Robert

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev

Reply via email to