On Mon, Mar 16, 2009 at 6:28 AM, Stefan Behnel <[email protected]> wrote:
> So, yes, there is a performance difference of up to 30% even for the
> fastest (BTW, branching) implementation. For a constant power-of-2 divisor
> (m=16), the difference is about 17% for me:
>
> ./cmod
> -1
> real    0m3.316s
> user    0m3.268s
> sys     0m0.000s
> ./py2mod
> -589934593
> real    0m3.880s
> user    0m3.868s
> sys     0m0.000s
> ./pymod
> -589934593
> real    0m4.634s
> user    0m4.580s
> sys     0m0.000s

But note that for constant power-of-2 divisors, getting Python
semantics by using a bitmask is actually faster than getting C
semantics:

cwi...@magnetar:/tmp$ time ./cmod
-1
real    0m1.875s
user    0m1.872s
sys     0m0.004s
cwi...@magnetar:/tmp$ time ./py3mod
-589934593
real    0m1.214s
user    0m1.212s
sys     0m0.000s

This is because Python semantics need only a single AND instruction,
whereas C semantics need the AND plus some fixups to produce a negative
result for negative dividends.  (So your pymod and py2mod above end up
doing an AND, subtracting to get the negative result, and then adding
to undo the subtraction.)
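
To make the difference concrete, here is a minimal sketch for m=16.
These are not the actual sources of cmod, pymod, or py3mod; the
function names c_mod16, py_mod16, and py_mod are hypothetical and
exist only for illustration. The mask version assumes a
two's-complement int and a positive power-of-2 divisor.

```c
#include <stdio.h>

/* C semantics: since C99 the result of % takes the sign of the
 * dividend, so c_mod16(-1) is -1. For a signed x the compiler cannot
 * emit a bare AND here; it has to fix up negative inputs. */
int c_mod16(int x)
{
    return x % 16;
}

/* Python semantics for the positive power-of-2 divisor 16: the result
 * is always in [0, 15], which is exactly the low four bits of x.
 * Assumes two's-complement representation of negative ints. */
int py_mod16(int x)
{
    return x & 15;
}

/* Branching Python-style modulo for an arbitrary nonzero divisor m:
 * start from the C result and, if it is nonzero and its sign differs
 * from the sign of m, shift it by m. */
int py_mod(int x, int m)
{
    int r = x % m;
    if (r != 0 && ((r ^ m) < 0))
        r += m;
    return r;
}
```

With these definitions, py_mod16(x) and py_mod(x, 16) agree for every
int x, while c_mod16(x) differs from both whenever x is negative and
not a multiple of 16.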

Carl
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
