#11740: reading integers from a file takes quadratic time
--------------------------------+-------------------------------------------
Reporter: zimmerma | Owner: AlexGhitza
Type: defect | Status: new
Priority: major | Milestone: sage-4.7.2
Component: basic arithmetic | Keywords:
Work_issues: | Upstream: N/A
Reviewer: | Author:
Merged: | Dependencies:
--------------------------------+-------------------------------------------
Comment(by nbruin):
Replying to [comment:7 leif]:
> Do we have direct conversion routines Python long <-> `mpz_t` / `mpn_t`?
It looks like we do:
{{{
sage: for j in range(10):
....: s = '1'*(100000*2^j)
....: t = cputime()
....: si = Integer(int(eval("".join(["Integer('",s,"')"]))))
....: cputime(t)
0.012997999999999621
0.014997000000001037
0.036994999999997447
0.074988000000001165
0.17497399999999885
0.42593499999999906
1.0088469999999994
2.1586709999999982
5.0012399999999992
12.395115000000001
}}}
> (I wouldn't make the Python spkg depend on GMP/MPIR btw.; instead we
might clone / adapt GMP's code in a patch to the Python spkg. No idea how
easy / reasonable that is.)
Yes, that is what I thought as well. However even in GMP/MPIR they only
think it's worth switching algorithm for 4000+ digits. I would not expect
Python maintainers to be very interested in optimizing for such a rare
situation.
There is a much easier workaround, though: Write your integers base
2,4,8,16 (or even 32). The 16 is probably most straightforward, because
literals are automatically recognised when prefixed by "0x". It's a much
more efficient way of storing integers and a 4000 digit number is going to
be meaningless to the human eye anyway. This is really *much* faster:
{{{
sage: for j in range(10):
....: s = "0x"+'1'*(100000*2^j)
....: t = cputime()
....: si = eval(s)
....: cputime(t)
0.0019989999999978636
0.0020000000000095497
0.0019999999999953388
0.0049989999999979773
0.012997999999996068
0.025995999999992137
0.05199300000001017
0.099985000000003765
0.19696999999999321
0.38994000000000995
}}}
So my proposal: Document that if people want to store long numbers, they
should do so in hex. Mention that "Integer(<string>)" is asymptotically
faster that "int(<string>)" and that the preparser turns "<literal>" into
something equivalent to "Integer(int(<string>))".
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/11740#comment:9>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/sage-trac?hl=en.