Hi,
William Stein wrote:
> I just tried these two programs:
>
> main() {
> int n = -1, m = 16, k, a=0;
> for(k=1;k<100000000;k++) {
> a = n%k + (a<0)?k:0;
> }
> }
Note that this does not have the proposed semantics.
> and
>
> main() {
> int n = -1, m = 16, k, a=0;
> for(k=1;k<100000000;k++) {
> a = n%k;
> }
> }
>
> and the first takes 22% longer than the second.
Here's what I tried. I added another factor of 10 to the loop to get
readable timings.
------------------------------------
/* cmod.c */
main() {
int n = -1, m = 16, k, a=0;
for(k=1;k<1000000000;k++) {
a += n%k;
n = -n;
}
printf("%i", a);
}
/* py2mod.c */
main() {
int n = -1, m = 16, k, a=0;
for(k=1;k<1000000000;k++) {
int mod = n%k; if (mod<0) mod+=k;
a += mod;
n = -n;
}
printf("%i", a);
}
/* pymod.c */
main() {
int n = -1, m = 16, k, a=0;
for(k=1;k<1000000000;k++) {
int mod = n%k; mod += (mod<0)*k;
a += mod;
n = -n;
}
printf("%i", a);
}
----------------------------------
./cmod
0
real 0m3.789s
user 0m3.704s
sys 0m0.000s
./py2mod
-371654657
real 0m4.893s
user 0m4.856s
sys 0m0.004s
./pymod
-371654657
real 0m5.623s
user 0m5.452s
sys 0m0.004s
So, yes, there is a performance difference of up to 30% even for the
fastest (BTW, branching) implementation. For a constant power-of-2 divisor
(m=16), the difference is about 17% for me:
./cmod
-1
real 0m3.316s
user 0m3.268s
sys 0m0.000s
./py2mod
-589934593
real 0m3.880s
user 0m3.868s
sys 0m0.000s
./pymod
-589934593
real 0m4.634s
user 0m4.580s
sys 0m0.000s
When I switch to "unsigned int" for "mod" (i.e. for "unsigned int n=3"), I
end up with about the same numbers as for cmod (signed or unsigned), which
means that gcc drops the if-branch as expected.
Looking at the absolute numbers, a speed penalty of even 30% does not sound
too bad for me. If better performance is really required (i.e. inside a
tight loop), you can switch to "unsigned int" (if you know that you never
use negative values), or use a dedicated helper function in the "cython"
namespace.
Anyway, given that I'm not a heavy numerics user of '%' I guess it's better
to take my weight out of this discussion. The (artificially determined) 30%
might still be too much for some people, when it's the default.
Stefan
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev