Re: [Cython] Semantics of cdef int division and %?

Dag Sverre Seljebotn Mon, 16 Mar 2009 02:22:20 -0700

Robert Bradshaw wrote:
> On Mar 12, 2009, at 2:15 AM, William Stein wrote:
> 
>> On Thu, Mar 12, 2009 at 1:46 AM, Dag Sverre Seljebotn
>> <[email protected]> wrote:
>>>> This could be very useful for debugging things, but it implies
>>>> there's a single, correct way that the % and // operators behave.
>>>>
>>>> The problem is that sometimes I want to run code with Python
>>>> semantics (e.g. I'm quickly cythonizing a file) and sometimes I want
>>>> to run code with C semantics (e.g. I'm doing linear algebra mod p,
>>>> and don't want the overhead of fixing the sign). And perhaps I'm too
>>>> demanding, but I want to be able to use % in both places rather than
>>>> know some obscure function call.
>>>>
>>> Perhaps you could ask the Sage list to get some more input on what the
>>> typical expectations of Cython are?
>>>
>>> (Also if you work in Z_p, could you not use unsigned ints? Then you
>>> don't get the overhead? Though I'm avoiding unsigned types like the
>>> plague myself after I figured out that range(-n, n) is empty if n is
>>> unsigned :-))
>> If you compute the sum of x and y mod p by doing "(x+y)%p", then the
>> time is dominated by the % operation, which can easily be nearly 10
>> times longer than +.  Now imagine doing linear algebra, where you take
>> dot products, etc., and do lots of mod p arithmetic.  If you dot two
>> vectors with n entries in the naive way you end up doing n %p's, which
>> is very expensive.  Now imagine that you represent the entries of your
>> vectors as C ints between -p/2 and p/2.  Then you can very often do
>> much of the dot product and only have to do very few mods, since
>> frequently when you add up a bunch of numbers between -(p/2)^2 and
>> (p/2)^2, they are likely to not overflow.  The sum will be close to 0
>> because of cancellation.
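
To make the idea above concrete, here is a rough sketch in plain Python (not Cython). The names centered, dot_mod_naive and dot_mod_delayed are made up for illustration, and since Python ints never overflow, the "limit" argument only simulates the range of a signed 32-bit C int. Entries are assumed to lie between -p/2 and p/2, with p odd and small enough that one product fits well inside the limit.

```python
def centered(x, p):
    """Map x to its representative in -(p//2) .. p//2 (p odd, p > 0)."""
    r = x % p  # Python's % gives 0 <= r < p for positive p
    return r - p if r > p // 2 else r


def dot_mod_naive(u, v, p):
    """Dot product mod p with one reduction per entry."""
    acc = 0
    for a, b in zip(u, v):
        acc = (acc + a * b) % p
    return acc


def dot_mod_delayed(u, v, p, limit=2**31 - 1):
    """Dot product mod p, reducing only when the accumulator gets close
    to 'limit' (a stand-in for the largest signed C int).

    Returns (result, number_of_reductions), counting the final one.
    """
    bound = (p // 2) ** 2  # largest possible |a * b| for centered entries
    acc = 0
    reductions = 0
    for a, b in zip(u, v):
        if abs(acc) > limit - bound:  # the next product might not fit
            acc = centered(acc, p)
            reductions += 1
        acc += a * b
    return acc % p, reductions + 1
```

With p = 10007 and vectors of 1000 entries, the delayed version needs only a handful of reductions instead of 1000, and fewer still when positive and negative products cancel.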
> 
> I was about to answer the very same thing when this email came in.
> Also, as you mentioned, unsigned ints have subtle semantics, like the
> above. Another bad (and easy; I've corrected it several times) mistake
> is writing (a-b)%p. If a and b are unsigned and a < b, the subtraction
> wraps around and the result is totally wrong. Of course, the main goal
> is to avoid division as much as possible.
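
The two unsigned pitfalls mentioned so far can be imitated in plain Python by masking to 32 bits. This is only a sketch: the 32-bit width is an assumption, and u32, sub_mod_unsigned, sub_mod_fixed and unsigned_range_len are hypothetical helper names, not anything Cython provides.

```python
MASK32 = 0xFFFFFFFF  # assumption: unsigned int is 32 bits wide


def u32(x):
    """Wrap a Python int the way a 32-bit unsigned C int would."""
    return x & MASK32


def sub_mod_unsigned(a, b, p):
    """(a - b) % p evaluated in 32-bit unsigned arithmetic."""
    return u32(a - b) % p


def sub_mod_fixed(a, b, p):
    """Safe variant for 0 <= a, b < p: add p before subtracting."""
    return u32(a + p - b) % p


def unsigned_range_len(n):
    """Iteration count of a loop from -n up to n when n is unsigned."""
    lo, hi = u32(-n), u32(n)  # -n wraps around to a huge positive value
    return max(0, hi - lo)
```

For a = 3, b = 5, p = 7 the mathematically expected answer is 5, but sub_mod_unsigned gives 2 because a - b wraps to 4294967294 first; sub_mod_fixed gives 5. Likewise unsigned_range_len(10) is 0, which is the empty-loop surprise.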
> 
>>> Perhaps you could ask the Sage list to get some more input on what the
>>> typical expectations of Cython are?
>> I often think of Cython primarily as a "Python compiler", and from
>> that perspective (-1 % 16)  being different in Cython and Python is
>> worrisome, since it will surely lead to subtle bugs.     For me,
>> Cython is all about writing fast code that is easy to use from Python.
>> Cython is not about writing C code; if I want to write C code I write
>> it in C.  Cython is not the C language after all.   Also, I think many
>> current and potential users of Cython (at least through Sage) don't
>> even know C, or if they do, they are using Cython so they don't have
>> to use C.
> 
> Good point. I'm going to start a thread there just to get a straw  
> poll. It's the largest community of Cython users I know of...


I did the same for NumPy. The results are here:

http://thread.gmane.org/gmane.comp.python.numeric.general/28726

(And sage-devel thread is here:

http://groups.google.com/group/sage-devel/browse_thread/thread/be9e2ef6a9745575
)

Both communities seem to mirror the discussion we've already had here
very closely. One group sees Cython as "fast Python" and wants Python
semantics; another sees it as "mixed Python and C" and wants the C
semantics.
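
For reference, the difference between the two camps' expectations can be spelled out in plain Python. Python's own // and % floor the quotient, so the remainder takes the sign of the divisor; C99 truncates toward zero, so the remainder takes the sign of the dividend. The helpers c_div and c_mod below are hypothetical names that merely emulate the C99 behaviour for nonzero divisors.

```python
def c_div(a, b):
    """Integer division truncating toward zero, as C99 '/' does."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def c_mod(a, b):
    """Remainder with the sign of the dividend, as C99 '%' does."""
    return a - b * c_div(a, b)


# Python semantics        C semantics
# -1 % 16  == 15          c_mod(-1, 16) == -1
# -7 // 2  == -4          c_div(-7, 2)  == -3
#  7 % -2  == -1          c_mod(7, -2)  ==  1
```

The two agree whenever both operands are nonnegative, so the disagreement only shows up once a negative operand sneaks in.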

(BTW NumPy uses the C semantics both for ints and floats, i.e. when 
doing % or // on arrays.)

-- 
Dag Sverre
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
