Hi,

William Stein wrote:
> I just tried these two programs:
> 
> main() {
>     int n = -1, m = 16, k, a=0;
>     for(k=1;k<100000000;k++) {
>       a = n%k + (a<0)?k:0;
>     }
> }

Note that this does not have the proposed semantics.

> and
> 
> main() {
>     int n = -1, m = 16, k, a=0;
>     for(k=1;k<100000000;k++) {
>       a = n%k;
>     }
> }
> 
> and the first takes 22% longer than the second.

Here's what I tried. I added another factor of 10 to the loop to get
readable timings.

------------------------------------
/* cmod.c */
main() {
    int n = -1, m = 16, k, a=0;
    for(k=1;k<1000000000;k++) {
      a += n%k;
      n = -n;
    }
    printf("%i", a);
}

/* py2mod.c */
main() {
    int n = -1, m = 16, k, a=0;
    for(k=1;k<1000000000;k++) {
      int mod = n%k; if (mod<0) mod+=k;
      a += mod;
      n = -n;
    }
    printf("%i", a);
}

/* pymod.c */
main() {
    int n = -1, m = 16, k, a=0;
    for(k=1;k<1000000000;k++) {
      int mod = n%k; mod += (mod<0)*k;
      a += mod;
      n = -n;
    }
    printf("%i", a);
}
----------------------------------

./cmod
0
real    0m3.789s
user    0m3.704s
sys     0m0.000s
./py2mod
-371654657
real    0m4.893s
user    0m4.856s
sys     0m0.004s
./pymod
-371654657
real    0m5.623s
user    0m5.452s
sys     0m0.004s

So, yes, there is a performance difference of up to 30% even for the
fastest (BTW, branching) implementation. For a constant power-of-2 divisor
(m=16), the difference is about 17% for me:

./cmod
-1
real    0m3.316s
user    0m3.268s
sys     0m0.000s
./py2mod
-589934593
real    0m3.880s
user    0m3.868s
sys     0m0.000s
./pymod
-589934593
real    0m4.634s
user    0m4.580s
sys     0m0.000s

When I switch to "unsigned int" for "mod" (i.e. for "unsigned int n=3"), I
end up with about the same numbers as for cmod (signed or unsigned), which
means that gcc drops the if-branch as expected.

Looking at the absolute numbers, a speed penalty of even 30% does not sound
too bad for me. If better performance is really required (i.e. inside a
tight loop), you can switch to "unsigned int" (if you know that you never
use negative values), or use a dedicated helper function in the "cython"
namespace.

Anyway, given that I'm not a heavy numerics user of '%' I guess it's better
to take my weight out of this discussion. The (artificially determined) 30%
might still be too much for some people, when it's the default.

Stefan

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev

Reply via email to