Tim Peters <t...@python.org> added the comment:

The test case here is a = (1 << 100000000) - 1, a solid string of 100 million 1 
bits. The goal is to convert to a decimal string.

Methods:

native: str(a)

numeral: the Python numeral() function from bpo-3451's div.py after adapting to 
use the Python divmod_fast() from the same report's fast_div.py.

todecstr: from the Python file attached to this report.

gmp: str() applied to gmpy2.mpz(a).

Timings:

native: don't know; gave up after waiting over 2 1/2 hours.
numeral: about 5 1/2 minutes.
todecstr: under 30 seconds. (*)
gmp: under 6 seconds.

So there's room for improvement ;-)

But here's the thing: I've lost count of how many times someone has whipped up 
a pure-Python implementation of a bigint algorithm that leaves CPython in the 
dust. And they're generally pretty easy in Python.

But then they die there, because converting to C is soul-crushing, losing the 
beauty and elegance and compactness to mountains of low-level details of 
memory-management, refcounting, and checking for errors after every tiny 
operation.

So a new question in this endless dilemma: _why_ do we need to convert to C? 
Why not leave the extreme cases to far-easier to write and maintain Python 
code? When we're cutting runtime from hours down to minutes, we're focusing on 
entirely the wrong end to not settle for 2 minutes because it may be 
theoretically possible to cut that to 1 minute by resorting to C.


(*) I hope this algorithm tickles you by defying expectations ;-) It 
essentially stands `numeral()` on its head by splitting the input by a power of 
2 instead of by a power of 10. As a result _no_ divisions are used. But instead 
of shifting decimal digits into place, it has to multiply the high-end pieces 
by powers of 2. That seems insane on the face of it, but hard to argue with the 
clock ;-) The "tricks" here are that the O(log log n) powers of 2 needed can be 
computed efficiently in advance of any splitting, and that all the heavy 
arithmetic is done _in_ the `decimal` module, which implements 
fancier-than-Karatsuba multiplication and whose values can be converted to 
decimal strings very quickly.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46558>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to