#11740: reading integers from a file takes quadratic time
--------------------------------+-------------------------------------------
   Reporter:  zimmerma          |          Owner:  AlexGhitza
       Type:  defect            |         Status:  new       
   Priority:  major             |      Milestone:  sage-4.7.2
  Component:  basic arithmetic  |       Keywords:            
Work_issues:                    |       Upstream:  N/A       
   Reviewer:                    |         Author:            
     Merged:                    |   Dependencies:            
--------------------------------+-------------------------------------------

Comment(by nbruin):

 Replying to [comment:7 leif]:
 > Do we have direct conversion routines Python long <-> `mpz_t` / `mpn_t`?
 It looks like we do:
 {{{
 sage: for j in range(10):
 ....:        s = '1'*(100000*2^j)
 ....:        t = cputime()
 ....:        si = Integer(int(eval("".join(["Integer('",s,"')"]))))
 ....:        cputime(t)
 0.012997999999999621
 0.014997000000001037
 0.036994999999997447
 0.074988000000001165
 0.17497399999999885
 0.42593499999999906
 1.0088469999999994
 2.1586709999999982
 5.0012399999999992
 12.395115000000001
 }}}
 > (I wouldn't make the Python spkg depend on GMP/MPIR btw.; instead we
 might clone / adapt GMP's code in a patch to the Python spkg. No idea how
 easy / reasonable that is.)

 Yes, that is what I thought as well. However even in GMP/MPIR they only
 think it's worth switching algorithm for 4000+ digits. I would not expect
 Python maintainers to be very interested in optimizing for such a rare
 situation.

 There is a much easier workaround, though: Write your integers base
 2,4,8,16 (or even 32). The 16 is probably most straightforward, because
 literals are automatically recognised when prefixed by "0x". It's a much
 more efficient way of storing integers and a 4000 digit number is going to
 be meaningless to the human eye anyway. This is really *much* faster:
 {{{
 sage: for j in range(10):
 ....:        s = "0x"+'1'*(100000*2^j)
 ....:        t = cputime()
 ....:        si = eval(s)
 ....:        cputime(t)
 0.0019989999999978636
 0.0020000000000095497
 0.0019999999999953388
 0.0049989999999979773
 0.012997999999996068
 0.025995999999992137
 0.05199300000001017
 0.099985000000003765
 0.19696999999999321
 0.38994000000000995
 }}}
 So my proposal: Document that if people want to store long numbers, they
 should do so in hex. Mention that "Integer(<string>)" is asymptotically
 faster that "int(<string>)" and that the preparser turns "<literal>" into
 something equivalent to "Integer(int(<string>))".

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/11740#comment:9>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sage-trac?hl=en.

Reply via email to