[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-23 Thread Stefan Krah

Changes by Stefan Krah:


--
nosy: +steven.daprano

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-23 Thread Stefan Krah

Stefan Krah added the comment:

I guess there's some version mixup here: from Python 3.3 on,
the integrated C version of decimal does not store the digits
as a string and does not have the private _int attribute.
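
For comparison, the pure-Python implementation can be imported directly to see the attributes under discussion. This is only a minimal sketch, assuming Python 3.5 or later (where the pure-Python code lives in the _pydecimal module); _int, _exp and _sign are private to that implementation and may change without notice.

```python
import _pydecimal  # pure-Python decimal implementation (Python 3.5+)

d = _pydecimal.Decimal("100.00")

# Private attributes of the pure-Python Decimal; not part of the public API.
print(d._int)   # coefficient digits, stored as a str
print(d._exp)   # exponent, an int for finite values
print(d._sign)  # 0 for positive, 1 for negative
```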

--
nosy: +skrah




[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-23 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Maybe implement the as_integer_ratio() method and/or numerator and denominator
attributes in the Decimal class?
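
A rough sketch of what such a method could compute, written as a free function on top of the public as_tuple() API. The name decimal_as_integer_ratio is hypothetical and used only for illustration; this is not how any actual Decimal method is implemented.

```python
from decimal import Decimal
from math import gcd


def decimal_as_integer_ratio(d):
    """Hypothetical helper: return (numerator, denominator) in lowest terms."""
    if not d.is_finite():
        raise ValueError("cannot convert NaN or infinity to an integer ratio")
    sign, digits, exp = d.as_tuple()
    # Join the digit tuple into a string and parse it as one integer.
    num = int("".join(map(str, digits)))
    if exp >= 0:
        num *= 10 ** exp
        den = 1
    else:
        den = 10 ** -exp
    # Reduce to lowest terms, as float.as_integer_ratio() does.
    common = gcd(num, den)
    num, den = num // common, den // common
    return (-num if sign else num, den)


print(decimal_as_integer_ratio(Decimal("-3.14")))
```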

--
nosy: +serhiy.storchaka




[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-23 Thread Stefan Krah

Changes by Stefan Krah:


--
versions:  -Python 3.5




[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-23 Thread John Walker

John Walker added the comment:

> I guess there's some version mixup here: from Python 3.3 on,
> the integrated C version of decimal does not store the digits
> as a string and does not have the private _int attribute.

Stefan, _int is a slot in Lib/_pydecimal.py. It should be defined on Python 3.5
and tip, unsure about other versions.

Python 3.5.1 (default, Dec  7 2015, 12:58:09) 
[GCC 5.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from decimal import Decimal
>>> Decimal("100.00")
Decimal('100.00')
>>> Decimal("100.00")._int
'10000'

--
versions: +Python 3.5




[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-23 Thread Stefan Krah

Stefan Krah added the comment:

On Wed, Dec 23, 2015 at 09:01:22PM +0000, John Walker wrote:
> Stefan, _int is a slot in Lib/_pydecimal.py. It should be defined on Python
> 3.5 and tip, unsure about other versions.
> 
> Python 3.5.1 (default, Dec  7 2015, 12:58:09) 
> [GCC 5.2.0] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from decimal import Decimal
> >>> Decimal("100.00")
> Decimal('100.00')
> >>> Decimal("100.00")._int
> '10000'

That should only happen if the C version did not build for some reason:

Python 3.6.0a0 (default:323c10701e5d, Dec 14 2015, 14:28:41) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from decimal import Decimal
>>> Decimal("100.00")._int
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'decimal.Decimal' object has no attribute '_int'
>>>
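
One way to tell which implementation a given interpreter ended up with, sketched under the assumption of Python 3.5 or later, where the decimal module falls back to _pydecimal if the C extension is unavailable. The helper name decimal_backend is made up for this example.

```python
import decimal
import _pydecimal  # pure-Python implementation, importable on Python 3.5+


def decimal_backend():
    """Return "pure" if decimal fell back to _pydecimal, else "C"."""
    return "pure" if decimal.Decimal is _pydecimal.Decimal else "C"


backend = decimal_backend()
# The private _int attribute is only expected on the pure-Python class.
has_private_int = hasattr(decimal.Decimal("100.00"), "_int")
print(backend, has_private_int)
```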

--




[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-23 Thread John Walker

John Walker added the comment:

> No, the regular build uses the libmpdec that is shipped with
> Python.  The external libmpdec.so only comes into play if you
> compile --with-system-libmpdec.

Oh, OK. I see what's happening. My distro deletes the shipped version and
compiles --with-system-libmpdec. We're on the same page now, thanks.

https://projects.archlinux.org/svntogit/packages.git/tree/trunk/PKGBUILD?h=packages/python

--




[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-23 Thread Stefan Krah

Stefan Krah added the comment:

No, the regular build uses the libmpdec that is shipped with
Python.  The external libmpdec.so only comes into play if you
compile --with-system-libmpdec.

--




[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-23 Thread John Walker

John Walker added the comment:

> That should only happen if the C version did not build for some reason:

Ahh, gotcha. I assume one instance where this happens is when the machine 
doesn't have libmpdec.so

--




[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-23 Thread Steven D'Aprano

Steven D'Aprano added the comment:

On Wed, Dec 23, 2015 at 03:18:28PM +0000, Serhiy Storchaka wrote:
> Maybe implement the as_integer_ratio() method and/or numerator and
> denominator attributes in the Decimal class?

That would also be good as it will decrease the API differences between 
floats and Decimals and make it easier to duck-type one for the other.
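
To illustrate the duck-typing point, here is a small sketch that treats floats and Decimals uniformly. It assumes a Python version in which Decimal provides as_integer_ratio() (3.6 and later, as far as I know); exact_fraction is a hypothetical helper name.

```python
from decimal import Decimal
from fractions import Fraction


def exact_fraction(x):
    """Hypothetical helper: works for any x that has as_integer_ratio()."""
    return Fraction(*x.as_integer_ratio())


# The same call path serves both numeric types.
print(exact_fraction(0.25))
print(exact_fraction(Decimal("0.25")))
```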

--




[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-22 Thread John Walker

John Walker added the comment:

Here's the output of running the benchmark on my machine:

Testing proposed implementation
number = 1
0.07098613299967838
number = 10
0.6952260910002224
number = 100
6.948197601999709
Testing current implementation
number = 1
0.141816276996
number = 10
1.350394603001405
number = 100
13.625065807000283

--




[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-22 Thread John Walker

Changes by John Walker:


--
type:  -> performance




[issue25928] Improve performance of statistics._decimal_to_ratio and fractions.from_decimal

2015-12-22 Thread John Walker

New submission from John Walker:

In the statistics module, there is a FIXME on line 250, above
_decimal_to_ratio, that says:

# FIXME This is faster than Fraction.from_decimal, but still too slow.

Half of the time is spent on a conversion in d.as_tuple(). Decimal internally
stores the digits as a string, but d.as_tuple() converts each digit to an
integer and returns the digits as a tuple of integers.

This is OK, but _decimal_to_ratio then undoes that work by accumulating the
digits back into a single integer. Fraction.from_decimal takes a similar but
slightly different approach: it converts the tuple into a string and then
parses that string as an integer. We can be a lot faster if we use the
_int instance variable directly.
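
The two round trips described above can be sketched with the public API alone. This is an illustration of the conversions, not the exact code from either module.

```python
from decimal import Decimal

d = Decimal("100.00")
sign, digits, exp = d.as_tuple()  # digits is a tuple of ints: (1, 0, 0, 0, 0)

# Digit-by-digit accumulation, as in statistics._decimal_to_ratio.
num_loop = 0
for digit in digits:
    num_loop = num_loop * 10 + digit

# Join the digits into a string and parse it, the approach described
# above for Fraction.from_decimal.
num_join = int("".join(map(str, digits)))

print(digits, exp, num_loop, num_join)
```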

In the case of _decimal_to_ratio, the new code seems to be twice as fast when
called as _decimal_to_ratio(Decimal(str(random.random()))):


def _decimal_to_ratio(d):
    sign, exp = d._sign, d._exp
    if exp in ('F', 'n', 'N'):  # INF, NAN, sNAN
        assert not d.is_finite()
        return (d, None)
    num = int(d._int)
    if exp < 0:
        den = 10**-exp
    else:
        num *= 10**exp
        den = 1
    if sign:
        num = -num
    return (num, den)

If the performance improvement is considered worthwhile, here are a few 
solutions I see.

1) Use _int directly in fractions and statistics.

2) Add a digits_as_str method to Decimal. This prevents exposing _int as an 
implementation detail, and makes sense to me since I suspect there is a lot of 
code casting the tuple of int to a string anyway. 

3) Add a parameter to as_tuple for determining whether digits should be 
returned as a string or a tuple.

4) Deprecate _int in Decimal and add a new reference str_digits.

There are probably more solutions. I lean towards 4, because it makes usage 
easier and avoids cluttering Decimal with methods. 
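
To make the interface of option 2 concrete, here is a stand-in built only on the public API. The name digits_as_str is the hypothetical one from the list above; this sketch shows the intended return value, not the speedup, since a real method would presumably hand back the stored string without going through as_tuple().

```python
from decimal import Decimal


def digits_as_str(d):
    """Hypothetical stand-in for option 2, built on public API only."""
    return "".join(map(str, d.as_tuple().digits))


print(digits_as_str(Decimal("100.00")))
print(digits_as_str(Decimal("-0.125")))
```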

Here is what I used for benchmarks:



import timeit

old_setup = """
import random
from decimal import Decimal

def _decimal_to_ratio(d):
    sign, digits, exp = d.as_tuple()
    if exp in ('F', 'n', 'N'):  # INF, NAN, sNAN
        assert not d.is_finite()
        return (d, None)
    num = 0
    for digit in digits:
        num = num*10 + digit
    if exp < 0:
        den = 10**-exp
    else:
        num *= 10**exp
        den = 1
    if sign:
        num = -num
    return (num, den)

def run_it():
    dec = Decimal(str(random.random()))
    _decimal_to_ratio(dec)
"""

new_setup = """
import random
from decimal import Decimal

def _decimal_to_ratio(d):
    sign, exp = d._sign, d._exp
    if exp in ('F', 'n', 'N'):  # INF, NAN, sNAN
        assert not d.is_finite()
        return (d, None)
    num = int(d._int)
    if exp < 0:
        den = 10**-exp
    else:
        num *= 10**exp
        den = 1
    if sign:
        num = -num
    return (num, den)

def run_it():
    dec = Decimal(str(random.random()))
    _decimal_to_ratio(dec)
"""

if __name__ == '__main__':
    print("Testing proposed implementation")
    print("number = 1")
    print(timeit.Timer(stmt='run_it()', setup=new_setup).timeit(number=1))
    print("number = 10")
    print(timeit.Timer(stmt='run_it()', setup=new_setup).timeit(number=10))
    print("number = 100")
    print(timeit.Timer(stmt='run_it()', setup=new_setup).timeit(number=100))

    print("Testing old implementation")
    print("number = 1")
    print(timeit.Timer(stmt='run_it()', setup=old_setup).timeit(number=1))
    print("number = 10")
    print(timeit.Timer(stmt='run_it()', setup=old_setup).timeit(number=10))
    print("number = 100")
    print(timeit.Timer(stmt='run_it()', setup=old_setup).timeit(number=100))
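
The script above also times random.random() and the Decimal construction on every call. A possible variant, sketched here as an assumption about how to isolate the conversion cost, builds the inputs once and takes the best of several repeats. The names ratio_via_as_tuple, samples and run_all are made up for this example, and only the as_tuple()-based version is timed because _int may not exist on a given build.

```python
import random
import timeit
from decimal import Decimal


def ratio_via_as_tuple(d):
    """The as_tuple()-based conversion from the benchmark above."""
    sign, digits, exp = d.as_tuple()
    if exp in ('F', 'n', 'N'):  # INF, NAN, sNAN
        return (d, None)
    num = 0
    for digit in digits:
        num = num * 10 + digit
    if exp < 0:
        den = 10 ** -exp
    else:
        num *= 10 ** exp
        den = 1
    return (-num if sign else num, den)


# Build the inputs once so that only the conversion itself is timed.
random.seed(0)
samples = [Decimal(str(random.random())) for _ in range(1000)]


def run_all():
    for d in samples:
        ratio_via_as_tuple(d)


# The best of several repeats is less sensitive to background noise.
best = min(timeit.repeat(run_all, repeat=3, number=10))
print("best of 3: %.6f s for 10 passes over %d samples" % (best, len(samples)))
```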

--
components: Library (Lib)
messages: 256873
nosy: johnwalker
priority: normal
severity: normal
status: open
title: Improve performance of statistics._decimal_to_ratio and 
fractions.from_decimal
versions: Python 3.5
