On Thu, 30 Nov 2006 18:04:12 -0800, David Harvey <[EMAIL PROTECTED]> wrote:

> On Nov 30, 2006, at 8:50 PM, William Stein wrote:
>
>>> IntegerModRing now has a method precompute_table() which will create
>>> a cached table of all ring elements, which cuts down on the overhead
>>> a lot. Perhaps this should be automatically called in the
>>> constructor on rings of a certain size or less.
>>
>> I've applied your patch and it does indeed speed things up. Thanks.
>>
>> Yes, it should be automatic. I did some tests, and making it
>> automatic up to, say, 10^4 would certainly make sense. Then it costs
>> at most 0.10 seconds and 1MB RAM.
>
> I am starting to get really uneasy about all these cached ring
> elements. SAGE is going to start becoming a real memory hog if we
> keep doing this.
>
> I don't know where you get 1MB from. Is that 1MB per ring? (100 bytes
> per ring element?)
It's less than 1MB per ring, and I got it by simply running the table
creation and seeing that the memory usage increased by less than 1MB.

> If it's 1MB per ring, then I only have to be
> working modulo 500 or so integers simultaneously --- not at all a far-
> fetched situation --- before my typical desktop PC runs out of RAM.

We could automatically cache only up to 50 rings, say. If the user
*wants* caching, they can always explicitly ask for it, and if they
don't, they can always explicitly ask not to have it:

sage: Integers(97, cache=False)
sage: Integers(97, cache=True)

If you are explicit about caching, then either option should be
respected. If you don't specify anything for cache, it could use some
rules, such as suggested above.

> Which is of course ridiculous. SAGE must be able to run with < 64 MB
> RAM.

SAGE currently doesn't run on a machine with < 64MB RAM, since when it
starts, PARI already takes 50MB of RAM for its stack (it has a horrible
"memory management" scheme). It might be possible to reduce this with
some tricks, but nobody has complained or put in the effort to do this.

On the other hand, if you create a large number of elements modulo one
integer, then with the caching you won't run out of memory, whereas
without it you might very well run out.

> I get the feeling that all these caching solutions are treating a
> symptom rather than the underlying problem. The underlying problem is
> that we don't have a decently fast way of creating Python objects in
> Pyrex.

That's like saying "the underlying problem is that we're using an
interpreter". There are at least two problems:

(1) Speed of creating elements. For Robert's benchmark:

        http://modular.math.washington.edu:8101/benches

    Without caching: 2.26 seconds
    With caching:    1.42 seconds

(2) Memory. If you create 10^6 cached elements, say, then that costs
    only 10^6 pointers. If you do not cache the elements, it will cost
    you 4-5 times as much space.
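To make the cache=True / cache=False constructor idea concrete, here is
a minimal pure-Python sketch. The names Zmod and ZmodElement are
hypothetical stand-ins for IntegerModRing and its elements, not SAGE's
actual implementation; the 10^4 threshold is the rule suggested above.

```python
# Hypothetical sketch of the proposed cache=... flag; NOT SAGE's API.

class ZmodElement:
    __slots__ = ('parent', 'value')
    def __init__(self, parent, value):
        self.parent = parent
        self.value = value % parent.n

class Zmod:
    def __init__(self, n, cache=None):
        self.n = n
        # Default rule from the thread: cache automatically for small n,
        # unless the user explicitly asks one way or the other.
        if cache is None:
            cache = n <= 10**4
        self._table = [ZmodElement(self, v) for v in range(n)] if cache else None

    def __call__(self, v):
        if self._table is not None:
            return self._table[v % self.n]   # shared, pre-built element
        return ZmodElement(self, v)          # fresh object on every call

R = Zmod(97, cache=True)
assert R(5) is R(102)       # cached ring: the very same object
S = Zmod(97, cache=False)
assert S(5) is not S(102)   # uncached ring: distinct but equal-valued
```

The point of the shared table is visible in the identity checks: a
cached ring hands back one object per residue class, so 10^6 element
"creations" cost only pointer lookups.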
So by banning caching in order to save memory, you may very well be
using way more memory in a common use case.

For comparison, PARI/GP takes 1.817 seconds to do the benchmark.
MAGMA takes 0.620 seconds:

> b := GF(389)!10;
> time for i in [1..10^6] do c := b*b + b*b; end for;
Time: 0.620

But keep in mind that as a language MAGMA is hugely restrictive -- no
cyclic garbage collection, no user-defined types, no exception
handling, no eval statement, etc. After years of pushing for MAGMA to
have things like user-defined types, I started finding out that there
is a tradeoff between speed and all those other nice things.

The same benchmark in C, via sagex (made a little harder to avoid
compiler optimization), takes 0.01 seconds:

%sagex
def bench(int n):
    cdef int i, b, c
    b = 10
    c = 0
    for i from 0 <= i < n:
        c = (c*b + b*b) % 389   # "c*b" to avoid compiler optimizations
    return c

sage: time bench(10^6)
362
CPU time: 0.01 s, Wall time: 0.01 s

CONCLUSIONS:

(1) Constantly keep in mind ways to make it easier for users to write
    snippets of compiled code!

(2) In this benchmark SAGE is > 60 times faster than MAGMA. REALLY. I
    vividly remember many times programming in the MAGMA interpreter,
    wanting to make some inner loop faster, and simply having no
    possible way to do it.

(3) PARI has gp2c, but that's less convenient to use, since it operates
    on whole programs, and you can't just use it inline from the
    interpreter.

> We got some of the way towards solving this problem at SAGE
> Days 2. Much of the work has been done for the Integer class. I hope
> that someone can put some effort into generalising and improving
> those efforts. I don't have the time right now to work on this.

I pushed that same work through to most other compiled classes. E.g.,
Python's own ints, with their caching etc., do the benchmark mentioned
above in 0.70 seconds, so SAGE's object creation is now not that far
off from Python's own built-in optimized object creation.
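For anyone who wants to reproduce the shape of the benchmark without
sagex, here is a pure-Python rendition of the loop being timed, using
the standard timeit module. The name bench is just illustrative, c is
started from 0, and absolute timings on current hardware will of course
differ from the 2006 figures quoted above.

```python
# Pure-Python version of the benchmark loop: repeat c = (c*b + b*b) mod p.
# This only mirrors the arithmetic pattern; it is not the compiled
# sagex code and will not match the quoted timings.
import timeit

def bench(n, p=389, b=10):
    """Iterate c = (c*b + b*b) mod p, n times, starting from c = 0."""
    c = 0
    for _ in range(n):
        c = (c * b + b * b) % p
    return c

if __name__ == "__main__":
    t = timeit.timeit(lambda: bench(10**6), number=1)
    print("bench(10^6) =", bench(10**6), "in", round(t, 3), "s")
```

Comparing this against the compiled version is exactly the kind of
"snippet of compiled code" speedup conclusion (1) is about.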
William

--
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at
http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
