New submission from Tim Peters <t...@python.org>:

Our internal base conversion algorithms between power-of-2 and non-power-of-2 
bases are quadratic time, and that's been annoying forever ;-) This applies to 
int<->str and int<->decimal.Decimal conversions. Sometimes the conversion is 
implicit, like when comparing an int to a Decimal.

For example:

>>> a = 1 << 1000000000 # yup! a billion and one bits
>>> s = str(a)

I gave up after waiting for over 8 hours, and the computation apparently can't 
be interrupted.

In contrast, using the function in the attached todecstr.py gets the result in 
under a minute:

>>> a = 1 << 1000000000
>>> s = todecstr(a)
>>> len(s)
301029996

That builds an equal decimal.Decimal in a "clever" recursive way, and then just 
applies str to _that_.

That's actually a best case for the function, which gets major benefit from the 
mountains of trailing 0 bits. A worst case is all 1-bits, but that still 
finishes in under 5 minutes:

>>> a = 1 << 1000000000
>>> s2 = todecstr(a - 1)
>>> len(s2)
301029996
>>> s[-10:], s2[-10:]
('1787109376', '1787109375')

A similar kind of function could certainly be written to convert from Decimal 
to int much faster, but it would probably be less effective. These things avoid 
explicit division entirely, but fat multiplies are key, and Decimal implements 
a fancier * algorithm than Karatsuba.

Not for the faint of heart ;-)

----------
components: Interpreter Core
files: todecstr.py
messages: 411962
nosy: tim.peters
priority: normal
severity: normal
status: open
title: Quadratic time internal base conversions
type: performance
Added file: https://bugs.python.org/file50593/todecstr.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46558>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to