Note that

 <int>L[i][<int>rows[i]] + j %w == 0:

would probably be just (or nearly) as fast as

 ((<int>(<tuple>L[i])[<int>(rows[i])])+j %w)==0

If you're going to be dealing with arrays of ints you might want to
look into NumPy and/or memory views for even more speed.


On Thu, Mar 13, 2014 at 7:58 PM, Georgios Tzanakis <[email protected]> wrote:
> Hi Simon,
>
> I really appreciate your thorough answer! Indeed there was a bug and I had
> to do a couple of changes
> to the code, but I understood a lot of things about how to use Cython and
> was able to use it properly
> and have improvements. On top of that, I didn't know about the timeit
> function which is a life saver.
>
> Everything is clear.. Thank you, sir!
>
> Best,
> George
>
>
> On Thu, Mar 13, 2014 at 8:48 AM, Simon King <[email protected]> wrote:
>>
>> Hi!
>>
>> On 2014-03-12, geo909 <[email protected]> wrote:
>> > But I'm still not sure how to use things properly. So, for instance, is
>> > the
>> > following optimization reasonable?
>> > (there is an ~30% increase in speed from pure python code)
>>
>> It is easy to get more.
>>
>> But first: Is there a bug in your code?
>>
>> You write
>>         if all( [(L[i][rows[i]]+j %w)==0] ):
>> Thus, the argument to "all" is a list with precisely one item.
>>
>> If it is not a bug, then you should replace it with
>>        if (L[i][rows[i]]+j%w)==0:
>>
>> I assume that it is not a bug, and thus I used this improvement in all my
>> attempts that I describe below.
>>
>> > # L: A list of tuples of positive integers, each having a couple of
>> > hundred
>> > elements.
>> > # L itself has length at most 3 or 4.
>> >
>> > # e: A tuple of integers. e has length no more than a couple of hundred.
>> > # w a small integer
>>
>> Since there is frequent access to the items of L and e, you should tell
>> Cython
>> that L is a list and that e is a tuple.
>>
>> Also, itertools.product yields tuples, so, "rows" in your function is a
>> tuple. Again, there is frequent acces to the items, thus, you should
>> declare that rows is a tuple.
>>
>> On the other hand, commonzeros is accessed at most a couple of times,
>> thus, no need to make it "cdef int".
>>
>> But it seems to me that the most important line is (after removing the
>> needless "all") this:
>>     if (L[i][rows[i]]+j %w)==0:
>>
>> Let's try to be particularly careful here, since it occurs in an inner
>> loop, and the annotation appears dark yellow.
>>
>> The items of L are tuples. Thus, one could do
>>    (<tuple>L[i])[rows[i]]+j
>> to make access to the tuple items faster.
>>
>> Furthermore, the items in the tuple L[i] are ints, and we want to add it
>> with an int. Hence,
>>    if ((<int>(<tuple>L[i])[rows[i]])+j %w)==0:
>> will make it faster (actually, inserting the <int> makes the execution
>> time drop to 50% compared with a version that only has <tuple>).
>>
>> With
>>   L = [tuple([randint(1,10^8) for i in range(400)]),
>>     tuple([randint(1,10^8) for i in range(300)]), tuple([randint(1,10^8)
>>     for i in range(500)]), tuple([randint(1,10^8) for i in range(200)])]
>>   e = tuple([randint(1,10^8) for i in range(350)])
>>   w = 5
>>
>> and a pure Python version of your function (where I have replaced the
>> "all(...)" as indicated above), I get
>>   sage: timeit("myfunction(L,e,w)")
>>   5 loops, best of 3: 1.11 s per loop
>>
>> However, when cythoning your function as follows
>> {{{
>> %cython
>> import itertools
>> def myfunction(list L, tuple e, int w):
>>     cdef int lenL = len(L)
>>     cdef int i,j
>>     cdef tuple rows
>>     for rows in itertools.product(range(w), repeat=lenL):
>>         commonzeros=0
>>         for j in e:
>>             for i in range(lenL):
>>                 if ((<int>(<tuple>L[i])[<int>(rows[i])])+j %w)==0:
>>                     commonzeros+=1
>>                     if commonzeros==4:
>>                         return(1)
>>     return(0)
>> }}}
>> I get
>>   sage: timeit("myfunction(L,e,w)")
>>   5 loops, best of 3: 18.6 ms per loop
>>
>> If you now look at the annotated version of the function, you'll see that
>>   for rows in itertools.product
>> remains dark yellow.
>>
>> So, if one wanted to optimise further, one should try to improve that.
>> Since you iterate over len(L) copes of range(w) (rather than over the
>> product of lists of different size), it should be not too difficult to
>> write a custom iterator in Cython.
>>
>> But perhaps the speedup (111 ms --> 18.6 ms) is good enough for you?
>>
>> Best regards,
>> Simon
>>
>>
>> --
>> You received this message because you are subscribed to a topic in the
>> Google Groups "sage-support" group.
>> To unsubscribe from this topic, visit
>> https://groups.google.com/d/topic/sage-support/S9eXmSVoo9E/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to
>> [email protected].
>>
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/sage-support.
>> For more options, visit https://groups.google.com/d/optout.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "sage-support" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/sage-support.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"sage-support" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-support.
For more options, visit https://groups.google.com/d/optout.

Reply via email to