Hi Maarten, a couple of other questions regarding Cython, based on the blog post you linked:
* Why are MPIR integer operations slower (according to the post) than Python's own big-integer operations? What additional functionality does MPIR provide that incurs this performance penalty?

* Most of my integers certainly fit into a C int. However, a few of them may actually be big integers. The blog post mentions that even "maybe big integers" are already orders of magnitude slower than plain ints. So what is a best practice in a situation where genuinely big integers are few and far between, but they do exist?

Thanks,
Felix

On Wednesday, November 20, 2013 7:27:11 PM UTC+1, Felix Breuer wrote:
>
> Hi Simon, hi Maarten,
>
> thank you for your answers!
>
> @Simon:
>
> Yes, I figured that calling vector(...) would impose these kinds of
> overhead. This is why I am using apply_map in the code, in the hope of
> saving some of this work.
>
> Thanks to your suggestion, I tried specifying the base ring in the call
> to apply_map. This already reduced the overhead of prim_v by about 3
> seconds, but apply_map is still the slowest part of the process.
>
> @Maarten:
>
> Yes, I do really simple things lots and lots of times, so your advice of
> turning to Cython (or, as Simon suggested, to tuples instead of vectors)
> sounds very good. So far, I have stuck to vectors in this part of the
> code mainly for convenience of notation (overloaded operators).
>
> The one piece of Sage's functionality I really need in this part of the
> code is arbitrary-precision integer arithmetic (including gcd). Do you
> know how this fits with Cython's types?
>
> Cheers,
> Felix
>
> On Wednesday, November 20, 2013 7:10:06 PM UTC+1, Maarten Derickx wrote:
>>
>> It seems like your code is mostly doing easy integer operations in
>> tight for loops (although in your case the for loop is hidden in
>> v.apply_map). If you care about performance in such cases, then you
>> should not use Python, but Cython.
>> Everything you do in Python has a small overhead. Normally this
>> doesn't matter, because the overhead is small relative to the actual
>> running time of the function. But for arithmetic operations on
>> integers, this overhead is often much, much bigger than the actual
>> running time. The typical solution for this is to use Cython code.
>> For a nice introduction to what Cython is and how to use it in Sage,
>> you can read:
>>
>> http://sagemath.blogspot.fr/2010/11/cython-sage-and-need-for-speed.html
>>
>> and of course there is the more technical documentation in the
>> developer's guide:
>>
>> http://www.sagemath.org/doc/developer/coding_in_cython.html
>>
>>
>> On Wednesday, November 20, 2013 11:02:59 AM UTC+1, Felix Breuer wrote:
>>>
>>> Hi all!
>>>
>>> I have a large collection (~50,000) of integer vectors in low
>>> dimension (~20). For each of these vectors, I want to divide all
>>> entries by their gcd. In other words, if
>>>
>>> def prim_v(v):
>>>     d = abs(gcd(v))
>>>     return v.apply_map(lambda vi: vi.divide_knowing_divisible_by(d))
>>>
>>> then I want to compute map(prim_v, V) for a long list V. When trying
>>> to profile this using %prun, I found it difficult to tell which part
>>> of the computation was taking the most time. So I rewrote it as
>>>
>>> def foo(v, d):
>>>     return v.divide_knowing_divisible_by(d)
>>>
>>> def bar(v):
>>>     return abs(gcd(v))
>>>
>>> def prim_v(v):
>>>     d = bar(v)
>>>     return v.apply_map(lambda vi: foo(vi, d))
>>>
>>> Then, profiling this with %prun yields the following information.
>>>
>>>  ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
>>>   47802    0.119    0.000   13.811    0.000  LDsolver.py:58(prim_v)
>>>   47802    0.116    0.000    3.593    0.000  LDsolver.py:54(bar)
>>>  859898    0.360    0.000    0.524    0.000  LDsolver.py:51(foo)
>>>
>>> If I am reading this correctly, this means that most of the time
>>> (~10 seconds) is not spent doing the actual computation (in foo and
>>> bar) but simply creating vectors and reading/writing values to/from
>>> vectors (in apply_map). (Using something like
>>> return vector(foo(vi, d) for vi in v) instead of apply_map does not
>>> make much of a difference.)
>>>
>>> Now, my questions are:
>>>
>>> 1) Why is this so slow? Am I missing something here?
>>> 2) Is there anything I can do to improve performance?
>>>
>>> Thank you very much,
>>> Felix
>>>

--
You received this message because you are subscribed to the Google Groups "sage-support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-support.
For more options, visit https://groups.google.com/groups/opt_out.
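[Editorial sketch, not from the thread.] The suggestion above of using tuples instead of Sage vectors can be made concrete in pure Python. This is a minimal sketch of prim_v, not the poster's actual code: it uses only the standard library (math.gcd, functools.reduce), and since Python ints are arbitrary precision, the occasional big entry is handled transparently — which also answers the gcd question, at least for a plain-Python prototype.

```python
from functools import reduce
from math import gcd

def prim_v(v):
    """Divide all entries of an integer tuple by their gcd.

    Pure-Python sketch using tuples instead of Sage vectors, to avoid
    the per-entry object overhead seen in apply_map above.  math.gcd
    ignores signs and returns a non-negative int, so no abs() on the
    entries is needed.
    """
    d = reduce(gcd, v)
    if d <= 1:                 # all-zero vector (d == 0) or already primitive
        return tuple(v)
    # d divides every entry exactly, so floor division is exact division
    return tuple(x // d for x in v)
```

In Cython one would go further and declare the loop variables with C types, but even this tuple version removes the vector-construction cost that dominated the profile above.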

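[Editorial sketch, not from the thread.] On the "mostly small ints, rarely big" question: one common pattern is a fast path on machine integers with a fallback to arbitrary precision. The sketch below mimics that dispatch in pure Python — in real Cython code the fast path would be a C-long Euclidean loop and the slow path would use mpz/Integer. The FITS_LONG threshold and the function name are hypothetical illustrations, not Sage or Cython API.

```python
from math import gcd as big_gcd  # stdlib arbitrary-precision gcd

# Hypothetical threshold: values below this magnitude would fit a C long
# on a 64-bit machine, with headroom for intermediate results.
FITS_LONG = 2**62

def gcd_fast_or_slow(a, b):
    """Dispatch between a 'machine integer' gcd and a big-integer gcd.

    Mimics the fast-path/slow-path split Cython code could use: a plain
    Euclidean loop when both operands are small (in Cython, on C longs
    with no Python object overhead), falling back to arbitrary-precision
    arithmetic for the rare big operands.
    """
    if -FITS_LONG < a < FITS_LONG and -FITS_LONG < b < FITS_LONG:
        # Fast path: simple Euclidean algorithm.
        while b:
            a, b = b, a % b
        return abs(a)
    # Slow path: rare big operands go through arbitrary precision.
    return big_gcd(a, b)
```

The point is that the expensive "maybe big" checks then happen once per operation via a cheap range test, instead of every value carrying big-integer overhead.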