[issue27181] Add geometric mean to `statistics` module

2019-04-07 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Feel free to reopen this if anything further needs to be changed or discussed.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue27181] Add geometric mean to `statistics` module

2019-04-07 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset 6463ba3061bd311413d2951dc83c565907e10459 by Raymond Hettinger in 
branch 'master':
bpo-27181: Add statistics.geometric_mean() (GH-12638)
https://github.com/python/cpython/commit/6463ba3061bd311413d2951dc83c565907e10459
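
For reference, a minimal usage sketch of the function this changeset adds, assuming Python 3.8 or later. The input values are illustrative only and are not taken from the changeset.

```python
from statistics import geometric_mean

# Three illustrative values whose product, 46656, is exactly 36 cubed.
values = [54, 24, 36]

# Rounding hides any last-digit floating-point noise in the result.
print(round(geometric_mean(values), 1))
```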


--




[issue27181] Add geometric mean to `statistics` module

2019-04-02 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Steven, how does this look?

https://patch-diff.githubusercontent.com/raw/python/cpython/pull/12638.diff

--




[issue27181] Add geometric mean to `statistics` module

2019-03-31 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
keywords: +patch
pull_requests: +12570




[issue27181] Add geometric mean to `statistics` module

2019-03-28 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> On the basis that something is better than nothing, go ahead.
> We can discuss accuracy and speed issues later.

Thanks.  I'll put together a PR for your consideration.

--




[issue27181] Add geometric mean to `statistics` module

2019-03-28 Thread STINNER Victor


Change by STINNER Victor :


--
nosy:  -vstinner




[issue27181] Add geometric mean to `statistics` module

2019-03-28 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

> In the spirit of "perfect is the enemy of good", would it be 
> reasonable to start with a simple, fast implementation using 
> exp-mean-log?  Then if someone wants to make it more accurate later, 
> they can do so.

I think that is a reasonable idea. On the basis that something is better 
than nothing, go ahead. We can discuss accuracy and speed issues later.

Getting some tricky cases down for reference:

# older (removed) implementation
py> geometric_mean([7]*2)
7.0
py> geometric_mean([7]*15)
7.0

# Raymond's newer (faster) implementation
py> exp(fmean(map(log, [7]*2)))
6.999
py> exp(fmean(map(log, [7]*15)))
6.999

py> geometric_mean([3,27])
9.0
py> geometric_mean([3,27]*5)
9.0

py> exp(fmean(map(log, [3,27])))
9.002
py> exp(fmean(map(log, [3,27]*5)))
8.998

py> x = 2.5e15
py> geometric_mean([x]*100)
2500.0
py> exp(fmean(map(log, [x]*100)))
2499.5

On the other hand, sometimes rounding errors work in our favour:

py> geometric_mean([1e50, 1e-50])  # people might expect 1.0
0.9998
py> 1e-50 == 1/(1e50)  # even though they aren't quite inverses
False

py> exp(fmean(map(log, [1e50, 1e-50])))
1.0
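
A small sketch for re-running the exp-mean-log side of these comparisons, assuming Python 3.8+ for `fmean`. The helper name `exp_mean_log` is made up for illustration; the printed digits depend on the platform's math library, so none are claimed here.

```python
from math import exp, log
from statistics import fmean


def exp_mean_log(data):
    """Geometric mean via exp(mean(log(x))); assumes every x > 0."""
    return exp(fmean(map(log, data)))


# (data, mathematically exact geometric mean)
cases = [
    ([7] * 2, 7.0),
    ([7] * 15, 7.0),
    ([3, 27], 9.0),
    ([3, 27] * 5, 9.0),
]

for data, exact in cases:
    got = exp_mean_log(data)
    rel_err = abs(got - exact) / exact
    print(f"n={len(data):2d}  got={got!r}  relative error={rel_err:.1e}")
```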

--




[issue27181] Add geometric mean to `statistics` module

2019-03-24 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Almost three years have passed.

In the spirit of "perfect is the enemy of good", would it be reasonable to 
start with a simple, fast implementation using exp-mean-log?  Then if someone 
wants to make it more accurate later, they can do so.

In some quick tests, I don't see much of an accuracy loss. It looks to be 
plenty good enough to use as a starting point:

--- Accuracy experiments ---

>>> from decimal import Decimal
>>> from functools import reduce
>>> from math import exp, fabs, log
>>> from operator import mul
>>> from random import expovariate, lognormvariate, triangular
>>> from statistics import fmean

>>> # https://www.wolframalpha.com/input/?i=geometric+mean+12,+17,+13,+5,+120,+7
>>> data = [12, 17, 13, 5, 120, 7]
>>> print(reduce(mul, map(Decimal, data)) ** (Decimal(1) / len(data)))
14.94412420173971227234687688
>>> exp(fmean(map(log, map(fabs, data))))
14.944124201739715

>>> data = [expovariate(50.0) for i in range(1_000)]
>>> print(reduce(mul, map(Decimal, data)) ** (Decimal(1) / len(data)))
0.01140902688569587677205587938
>>> exp(fmean(map(log, map(fabs, data))))
0.011409026885695879

>>> data = [triangular(2000.0, 3000.0, 2200.0) for i in range(10_000)]
>>> print(reduce(mul, map(Decimal, data)) ** (Decimal(1) / len(data)))
2388.381301718524160840023868
>>> exp(fmean(map(log, map(fabs, data))))
2388.3813017185225

>>> data = [lognormvariate(20.0, 3.0) for i in range(100_000)]
>>> min(data), max(data)
(2421.506538652375, 137887726484094.5)
>>> print(reduce(mul, map(Decimal, data)) ** (Decimal(1) / len(data)))
484709306.8805352290183838500
>>> exp(fmean(map(log, map(fabs, data))))
484709306.8805349
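
The experiments above can be packaged as a repeatable script. This is only a sketch: the helper names, the seed, and the 50-digit Decimal precision are arbitrary choices made for illustration, not part of the original tests.

```python
from decimal import Decimal, getcontext
from functools import reduce
from math import exp, log
from operator import mul
from random import Random
from statistics import fmean


def reference_gmean(data):
    """High-precision reference: Decimal product, then the n-th root."""
    product = reduce(mul, map(Decimal, data))
    return product ** (Decimal(1) / len(data))


def fast_gmean(data):
    """exp-mean-log in ordinary floats; assumes all values are > 0."""
    return exp(fmean(map(log, data)))


getcontext().prec = 50      # far more digits than a float can hold
rng = Random(27181)         # fixed, arbitrary seed for repeatability
data = [rng.triangular(2000.0, 3000.0, 2200.0) for _ in range(1_000)]

ref = reference_gmean(data)
fast = fast_gmean(data)
rel_err = abs(Decimal(fast) - ref) / ref

print("reference:", ref)
print("fast     :", fast)
print("rel. err :", float(rel_err))
```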

--




[issue27181] Add geometric mean to `statistics` module

2019-02-16 Thread Ned Deily


Change by Ned Deily :


--
nosy:  -ned.deily




[issue27181] Add geometric mean to `statistics` module

2019-02-16 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> Updating the version in case this wanted to be considered for 3.8.

Yes.  It would be nice to get this wrapped-up.

--




[issue27181] Add geometric mean to `statistics` module

2019-02-15 Thread Cheryl Sabella


Cheryl Sabella  added the comment:

Updating the version in case this wanted to be considered for 3.8.

--
versions: +Python 3.8 -Python 3.7




[issue27181] Add geometric mean to `statistics` module

2017-08-27 Thread Cheryl Sabella

Cheryl Sabella added the comment:

I was wondering if this has been taken up again for 3.7?  Thanks!

--
nosy: +csabella




[issue27181] Add geometric mean to `statistics` module

2016-10-04 Thread Ned Deily

Ned Deily added the comment:

Thanks, Steven.  Actually, we needed to remove geometric_mean from the 3.6 
branch, not the default branch (which will become 3.7).  I backported your 
removal patch to 3.6.  Feel free to reapply geometric_mean to the default 
branch at your leisure.

--
stage:  -> patch review




[issue27181] Add geometric mean to `statistics` module

2016-10-04 Thread Roundup Robot

Roundup Robot added the comment:

New changeset de0fa478c22e by Steven D'Aprano in branch '3.6':
Issue #27181 remove geometric_mean and defer for 3.7.
https://hg.python.org/cpython/rev/de0fa478c22e

--




[issue27181] Add geometric mean to `statistics` module

2016-10-04 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 9dce0e41bedd by Steven D'Aprano in branch 'default':
Issue #27181 remove geometric_mean and defer for 3.7.
https://hg.python.org/cpython/rev/9dce0e41bedd

--




[issue27181] Add geometric mean to `statistics` module

2016-10-04 Thread Steven D'Aprano

Steven D'Aprano added the comment:

I'm sorry to say that due to technical difficulties, geometric mean is not 
going to be in a fit state for beta 2 of 3.6, and so is going to be removed and 
delayed until 3.7.

--
priority: release blocker -> 
versions: +Python 3.7 -Python 3.6




[issue27181] Add geometric mean to `statistics` module

2016-09-12 Thread Steven D'Aprano

Steven D'Aprano added the comment:

On Mon, Sep 12, 2016 at 03:35:14PM +, Mark Dickinson wrote:
> statistics.geometric_mean(0.7 for _ in range(5000))

I've raised a new ticket #28111

--




[issue27181] Add geometric mean to `statistics` module

2016-09-12 Thread STINNER Victor

STINNER Victor added the comment:

>>> statistics.geometric_mean([0.7 for _ in range(5000)])
Traceback (most recent call last):
  File "/Users/mdickinson/Python/cpython-git/Lib/statistics.py", line 362, in float_nroot
    isinfinity = math.isinf(x)
OverflowError: int too large to convert to float

=> see also issue #27975

--




[issue27181] Add geometric mean to `statistics` module

2016-09-12 Thread Mark Dickinson

Mark Dickinson added the comment:

Steven: any thoughts about the

statistics.geometric_mean(0.7 for _ in range(5000))

failure? Should I open a separate bug report for that, or would you rather 
address it as part of this issue?

--




[issue27181] Add geometric mean to `statistics` module

2016-09-11 Thread Steven D'Aprano

Steven D'Aprano added the comment:

As discussed with Ned by email, I'm currently unable to build 3.6 and won't 
have time to work on this before b1. As discussed on #27761 my tests here are 
too strict and should be loosened, e.g. from assertEqual to assertAlmostEqual. 
Ned wrote:

"If you are only planning to make changes to the tests themselves, I think that 
can wait for b2."

I have no plans to change the publicly visible interface of geometric_mean.

--




[issue27181] Add geometric mean to `statistics` module

2016-08-17 Thread Mark Dickinson

Mark Dickinson added the comment:

> self.assertEqual(self.nroot(x**12, 12), float(x))
> AssertionError: 1.1865 != 1.1868

That looks like a case where the test should simply be weakened to an 
`assertAlmostEqual` with a suitable tolerance; there's no strong reason to 
expect that `nroot` will give a faithfully rounded result in this case or any 
other.
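
A sketch of what the weakened test could look like. The `nroot` function is a hypothetical stand-in for the helper under test, and both the input value and the tolerance are arbitrary choices for illustration.

```python
import unittest
from fractions import Fraction


def nroot(x, n):
    """Hypothetical stand-in for the nth-root helper under test."""
    return float(x) ** (1.0 / n)


class NthRootTolerance(unittest.TestCase):
    def test_fraction(self):
        x = Fraction(7, 5)      # arbitrary illustrative value
        # Allow a small absolute error instead of demanding exact
        # equality, which a pow()-based root does not guarantee.
        self.assertAlmostEqual(nroot(x**12, 12), float(x), delta=1e-12)
```

A `delta` (absolute tolerance) is used here rather than `places` so the allowed error is stated directly; which tolerance is actually suitable is a separate question.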

--




[issue27181] Add geometric mean to `statistics` module

2016-08-17 Thread koobs

koobs added the comment:

For posterity, the following failure was observed on all FreeBSD buildbots
(9, 10, and 11 (current)):

==
FAIL: testFraction (test.test_statistics.Test_Nth_Root)
--
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/test/test_statistics.py", line 1247, in testFraction
    self.assertEqual(self.nroot(x**12, 12), float(x))
AssertionError: 1.1865 != 1.1868

--
nosy: +koobs




[issue27181] Add geometric mean to `statistics` module

2016-08-16 Thread STINNER Victor

STINNER Victor added the comment:

I would like to use buildbots to check for regressions, but I see a lot of red 
buildbots, so buildbots became useless :-/

I skipped failing test_statistics tests, since failures are known.

I put the priority to "release blocker".

I suggest either reverting the change or finding a fix before 3.6b1.

--
priority: normal -> release blocker




[issue27181] Add geometric mean to `statistics` module

2016-08-16 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 54288b160243 by Victor Stinner in branch 'default':
Issue #27181: Skip tests known to fail until a fix is found
https://hg.python.org/cpython/rev/54288b160243

--




[issue27181] Add geometric mean to `statistics` module

2016-08-16 Thread STINNER Victor

STINNER Victor added the comment:

Failure on s390x Debian 3.x:

http://buildbot.python.org/all/builders/s390x%20Debian%203.x/builds/1455/steps/test/logs/stdio

==
FAIL: testExactPowers (test.test_statistics.Test_Nth_Root) (i=29, n=11)
--
Traceback (most recent call last):
  File "/home/dje/cpython-buildarea/3.x.edelsohn-debian-z/build/Lib/test/test_statistics.py", line 1216, in testExactPowers
    self.assertEqual(self.nroot(x, n), i)
AssertionError: 29.004 != 29

==
FAIL: testExactPowersNegatives (test.test_statistics.Test_Nth_Root) (i=-29, n=11)
--
Traceback (most recent call last):
  File "/home/dje/cpython-buildarea/3.x.edelsohn-debian-z/build/Lib/test/test_statistics.py", line 1228, in testExactPowersNegatives
    self.assertEqual(self.nroot(x, n), i)
AssertionError: -29.004 != -29

--
nosy: +haypo




[issue27181] Add geometric mean to `statistics` module

2016-08-15 Thread Ned Deily

Ned Deily added the comment:

FTR, multiple platforms are failing in various ways, not just PPC64, so 
Issue27761 was expanded to cover them and has been marked as a "release 
blocker".

--
nosy: +ned.deily




[issue27181] Add geometric mean to `statistics` module

2016-08-14 Thread Mark Dickinson

Mark Dickinson added the comment:

A failing case:

>>> statistics.geometric_mean([0.7 for _ in range(5000)])
Traceback (most recent call last):
  File "/Users/mdickinson/Python/cpython-git/Lib/statistics.py", line 362, in float_nroot
    isinfinity = math.isinf(x)
OverflowError: int too large to convert to float

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mdickinson/Python/cpython-git/Lib/statistics.py", line 595, in geometric_mean
    s = 2**p * _nth_root(2**q, n)
  File "/Users/mdickinson/Python/cpython-git/Lib/statistics.py", line 346, in nth_root
    return _nroot_NS.float_nroot(x, n)
  File "/Users/mdickinson/Python/cpython-git/Lib/statistics.py", line 364, in float_nroot
    return _nroot_NS.bignum_nroot(x, n)
  File "/Users/mdickinson/Python/cpython-git/Lib/statistics.py", line 489, in bignum_nroot
    b = 2**q * _nroot_NS.nroot(2**r, n)
  File "/Users/mdickinson/Python/cpython-git/Lib/statistics.py", line 384, in nroot
    r1 = math.pow(x, 1.0/n)
OverflowError: int too large to convert to float
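
To see why this input is awkward even without the bignum code path, here is a sketch (assuming Python 3.8+ for `math.prod` and `fmean`) that compares a plain float product with a log-domain computation. It does not reproduce the traceback above; it only illustrates how extreme the product of 5000 copies of 0.7 is.

```python
from math import exp, log, prod
from statistics import fmean

data = [0.7] * 5000

# The true product is far below the smallest positive float, so the
# plain float product underflows and loses all significant digits.
naive_product = prod(data)
naive_root = naive_product ** (1.0 / len(data))

# Averaging logarithms keeps every intermediate value near log(0.7),
# well away from the overflow and underflow limits.
log_domain = exp(fmean(map(log, data)))

print(naive_product)    # underflowed value
print(naive_root)       # not close to 0.7
print(log_domain)       # close to 0.7
```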

--




[issue27181] Add geometric mean to `statistics` module

2016-08-13 Thread Steven D'Aprano

Steven D'Aprano added the comment:

I've created a new issue to track the loss of accuracy on PowerPC: 
http://bugs.python.org/issue27761

--




[issue27181] Add geometric mean to `statistics` module

2016-08-12 Thread Mark Dickinson

Mark Dickinson added the comment:

> According to my testing, math.pow(x, 0.5) is no worse than sqrt.

It certainly is worse than sqrt, both in terms of speed and accuracy. Whether 
the difference is enough to make it worth special-casing is another question, 
of course, and as you say, that can happen later.

--




[issue27181] Add geometric mean to `statistics` module

2016-08-12 Thread Steven D'Aprano

Steven D'Aprano added the comment:

I thought about special-casing n=2 to math.sqrt, but as that's an 
implementation detail I can make that change at any time. According 
to my testing, math.pow(x, 0.5) is no worse than sqrt, so I'm not 
sure if there's any advantage to having yet another branch.

I'd be interested in special-casing n=3 to math.cbrt (if and when it exists) 
now that it's a standard C99 function.

--




[issue27181] Add geometric mean to `statistics` module

2016-08-12 Thread Mark Dickinson

Mark Dickinson added the comment:

What, no patch for pre-commit review?!

For computing nth roots, it may be worth special-casing the case n=2: for 
floats, `math.sqrt` is likely to be faster and more precise than an ad-hoc 
algorithm. (Indeed, I'd expect it to be perfectly correctly rounded on the vast 
majority of current machines.)
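
A minimal sketch of that special case. The helper name `nth_root` and the `math.pow` fallback are assumptions made for illustration; this is not the code that was committed.

```python
import math


def nth_root(x, n):
    """Hypothetical helper: n-th root of a non-negative float."""
    if n == 2:
        # sqrt is a basic IEEE 754 operation and is normally
        # correctly rounded, unlike a general pow() call.
        return math.sqrt(x)
    return math.pow(x, 1.0 / n)


print(nth_root(2.0, 2))
print(nth_root(10.0, 3))
```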

--




[issue27181] Add geometric mean to `statistics` module

2016-08-11 Thread Martin Panter

Martin Panter added the comment:

Tests fail on a Power PC buildbot:

http://buildbot.python.org/all/builders/PPC64LE%20Fedora%203.x/builds/1476/steps/test/logs/stdio
==
FAIL: testExactPowers (test.test_statistics.Test_Nth_Root) (i=29, n=11)
--
Traceback (most recent call last):
  File "/home/shager/cpython-buildarea/3.x.edelsohn-fedora-ppc64le/build/Lib/test/test_statistics.py", line 1216, in testExactPowers
    self.assertEqual(self.nroot(x, n), i)
AssertionError: 29.004 != 29

==
FAIL: testExactPowersNegatives (test.test_statistics.Test_Nth_Root) (i=-29, n=11)
--
Traceback (most recent call last):
  File "/home/shager/cpython-buildarea/3.x.edelsohn-fedora-ppc64le/build/Lib/test/test_statistics.py", line 1228, in testExactPowersNegatives
    self.assertEqual(self.nroot(x, n), i)
AssertionError: -29.004 != -29

--
nosy: +martin.panter




[issue27181] Add geometric mean to `statistics` module

2016-08-09 Thread Ram Rachum

Ram Rachum added the comment:

I meant the mathematical definition.
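
For reference, the usual textbook definition for positive values, written out together with the equivalent log form (this is the standard formula, not wording proposed for the docstring):

```latex
\[
  \operatorname{GM}(x_1, \dots, x_n)
    = \Bigl( \prod_{i=1}^{n} x_i \Bigr)^{1/n}
    = \exp\Bigl( \frac{1}{n} \sum_{i=1}^{n} \ln x_i \Bigr),
  \qquad x_i > 0 .
\]
```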

--




[issue27181] Add geometric mean to `statistics` module

2016-08-09 Thread Steven D'Aprano

Steven D'Aprano added the comment:

On Tue, Aug 09, 2016 at 06:44:22AM +, Ram Rachum wrote:
> For `geometric_mean`, maybe I'd add one sentence that describes
> how the geometric mean is calculated.

What do you mean? As in, the mathematical definition of geometric mean?

Or do you mean a one sentence description of the algorithm?

--




[issue27181] Add geometric mean to `statistics` module

2016-08-09 Thread Ram Rachum

Ram Rachum added the comment:

Also... I like the detailed docstrings with the real-life examples! That stuff 
helps when coding and using an unfamiliar function (since I see the docs in a 
panel of my IDE), so I wish I'd see more detailed docstrings like these ones in 
the standard library. For `geometric_mean`, maybe I'd add one sentence that 
describes how the geometric mean is calculated.

--




[issue27181] Add geometric mean to `statistics` module

2016-08-09 Thread Ram Rachum

Ram Rachum added the comment:

Thanks for the patch Steven! I won't comment about the code because I don't 
know enough about these algorithms, but I'm thinking, since you also did a 
refactoring of the statistics module, maybe these should be two separate 
patches/commits so it'll be easy to see which part is the new feature and which 
part is moving existing code around?

--




[issue27181] Add geometric mean to `statistics` module

2016-08-08 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 9eb5edfcf604 by Steven D'Aprano in branch 'default':
Issue27181 add geometric mean.
https://hg.python.org/cpython/rev/9eb5edfcf604

--
nosy: +python-dev




[issue27181] Add geometric mean to `statistics` module

2016-07-09 Thread Mark Dickinson

Mark Dickinson added the comment:

> I would like to see them spelled-out:  geometric_mean and harmonic_mean

+1

--




[issue27181] Add geometric mean to `statistics` module

2016-07-08 Thread Raymond Hettinger

Raymond Hettinger added the comment:

I would like to see them spelled-out:  geometric_mean and harmonic_mean

--




[issue27181] Add geometric mean to `statistics` module

2016-07-08 Thread Steven D'Aprano

Steven D'Aprano added the comment:

Does anyone have any strong feeling about the name for these functions?

gmean and hmean;

geometric_mean and harmonic_mean

And "subcontrary_mean" is not an option :-)

--




[issue27181] Add geometric mean to `statistics` module

2016-06-09 Thread Mark Dickinson

Mark Dickinson added the comment:

> Hmm, well, I don't have SciPy installed, but I've found that despite 
> their (well-deserved) reputation, numpy (and presumably scipy) often 
> have rather naive algorithms that can lose accuracy rather 
> spectacularly.

Agreed. And as Ram Rachum hinted, there seems little point aiming to duplicate 
things that already exist in the de facto standard scientific libraries. So I 
think there's a place for a non-naive carefully computed geometric mean in the 
std. lib. statistics module, but I wouldn't see the point of simply adding an 
exp-mean-log implementation (not that anyone is advocating that).

--




[issue27181] Add geometric mean to `statistics` module

2016-06-09 Thread Steven D'Aprano

Steven D'Aprano added the comment:

On Thu, Jun 09, 2016 at 09:24:04AM +, Mark Dickinson wrote:

> On the other hand, apparently `exp(mean(log(...)))` is good enough for SciPy:

Hmm, well, I don't have SciPy installed, but I've found that despite 
their (well-deserved) reputation, numpy (and presumably scipy) often 
have rather naive algorithms that can lose accuracy rather 
spectacularly.

py> statistics.mean([1e50, 2e-50, -1e50, 2e-50])
1e-50
py> np.mean(np.array([1e50, 2e-50, -1e50, 2e-50]))
5e-51

py> statistics.mean([1e50, 2e-50, -1e50, 2e-50]*1000)
1e-50
py> np.mean(np.array([1e50, 2e-50, -1e50, 2e-50]*1000))
5.0002e-54

On the other hand, np is probably a hundred times (or more) faster, so I 
suppose accuracy/speed makes a good trade off.
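
The same kind of loss can be reproduced without NumPy. In this sketch a plain left-to-right accumulation loop (the hypothetical `naive_mean`) stands in for a naive summation; it is not a claim about how NumPy sums internally.

```python
import math
import statistics

data = [1e50, 2e-50, -1e50, 2e-50]


def naive_mean(values):
    """Left-to-right float accumulation, with no error compensation."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


print(naive_mean(data))               # the first 2e-50 is absorbed by 1e50
print(math.fsum(data) / len(data))    # correctly rounded sum, then divide
print(statistics.mean(data))          # exact rational arithmetic
```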

--




[issue27181] Add geometric mean to `statistics` module

2016-06-09 Thread Mark Dickinson

Mark Dickinson added the comment:

On the other hand, apparently `exp(mean(log(...)))` is good enough for SciPy: 
its current implementation looks like this:

def gmean(a, axis=0):
    a, axis = _chk_asarray(a, axis)
    log_a = ma.log(a)
    return ma.exp(log_a.mean(axis=axis))

--




[issue27181] Add geometric mean to `statistics` module

2016-06-09 Thread Mark Dickinson

Mark Dickinson added the comment:

Choice of algorithm is a bit tricky here. There are a couple of obvious 
algorithms that work mathematically but result in significant accuracy loss in 
an IEEE 754 floating-point implementation: one is `exp(mean(map(log, 
my_numbers)))`, where the log calls can introduce significant loss of 
information, and the other is `prod(x**(1./len(my_numbers)) for x in 
my_numbers)`, where the `**(1./n)` operation similarly discards information. A 
better algorithm numerically is `prod(my_numbers)**(1./len(my_numbers))`, but 
that's likely to overflow quickly for large datasets (and/or datasets 
containing large values).

I'd suggest something along the lines of 
`prod(my_numbers)**(1./len(my_numbers))`, but keeping track of the exponent of 
the product separately and renormalizing where necessary to avoid overflow.
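
One possible reading of that suggestion, as a sketch only: the function name is invented, the approach uses `math.frexp` to keep the running product as a mantissa plus a separate binary exponent, and it makes no attempt at the compensated-product accuracy discussed in the next paragraph.

```python
import math


def gmean_scaled_product(data):
    """Geometric mean of positive floats via a renormalized product.

    The running product is kept as mantissa * 2**exponent, so it
    neither overflows nor underflows however long the data set is.
    """
    mantissa, exponent, n = 1.0, 0, 0
    for x in data:
        if x <= 0.0:
            raise ValueError("data must be strictly positive")
        m, e = math.frexp(x)        # x == m * 2**e with 0.5 <= m < 1
        mantissa *= m
        exponent += e
        # Renormalize so the mantissa stays in [0.5, 1).
        mantissa, shift = math.frexp(mantissa)
        exponent += shift
        n += 1
    if n == 0:
        raise ValueError("data must not be empty")
    # exponent == q*n + r with 0 <= r < n, so the n-th root of the
    # power of two is 2**q * 2**(r/n), and 2**(r/n) lies in [1, 2).
    q, r = divmod(exponent, n)
    root = mantissa ** (1.0 / n) * 2.0 ** (r / n)
    return math.ldexp(root, q)


print(gmean_scaled_product([2.0, 8.0]))
print(gmean_scaled_product([1e200] * 10))   # plain product would overflow
```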

There are also algorithms for improved accuracy in a product, along the same 
lines as the algorithm used in fsum. See e.g., the paper "Accurate 
Floating-Point Product and Exponentiation" by Stef Graillat. [1] (I didn't know 
about this paper: I just saw a reference to it in a StackOverflow comment [2], 
which reminded me of this issue.)

[1] http://www-pequan.lip6.fr/~graillat/papers/IEEE-TC-prod.pdf
[2] 
http://stackoverflow.com/questions/37715250/safe-computation-of-geometric-mean

--
nosy: +mark.dickinson




[issue27181] Add geometric mean to `statistics` module

2016-06-04 Thread Ram Rachum

Ram Rachum added the comment:

And of course, if the goal of the `statistics` module is to be comprehensive, 
one should ask what the difference should be between this new module 
and a mature statistics module like `scipy.stats`, and whether we should try to 
copy the features of `scipy.stats`.

--




[issue27181] Add geometric mean to `statistics` module

2016-06-03 Thread Ram Rachum

Ram Rachum added the comment:

To complicate things further...

I implemented a geometric mean on my own, and then I figured out what I really 
want is a *weighted* geometric mean, so I implemented that for myself. If you'd 
want to include that, that'll be cool. Actually I'm not sure if the goal of the 
`statistics` module is to be comprehensive or minimal. I'm hoping it's meant to 
be comprehensive. But then I'd guess there would be a lot of things you'd want 
to add besides my little feature.
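
For concreteness, a sketch of what a weighted geometric mean could look like, using the log-domain formula exp(sum(w * log(x)) / sum(w)). The function name and signature are hypothetical; nothing like this is part of the `statistics` module discussed here.

```python
from math import exp, fsum, log


def weighted_geometric_mean(data, weights):
    """Hypothetical helper: exp(sum(w * log(x)) / sum(w)).

    Assumes positive data, non-negative weights, and a positive
    total weight.
    """
    data = list(data)
    weights = list(weights)
    if len(data) != len(weights):
        raise ValueError("data and weights must have the same length")
    total = fsum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return exp(fsum(w * log(x) for x, w in zip(data, weights)) / total)


print(weighted_geometric_mean([2, 8], [1, 1]))   # equal weights
print(weighted_geometric_mean([2, 8], [3, 1]))   # pulled toward 2
```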

--




[issue27181] Add geometric mean to `statistics` module

2016-06-02 Thread Steven D'Aprano

Steven D'Aprano added the comment:

On Thu, Jun 02, 2016 at 09:04:54PM +, Raymond Hettinger wrote:
> Steven, this seems like a reasonable suggestion (though I would expect 
> someone else will immediately suggest a harmonic mean as well).  Is 
> this within the scope of what you were trying to do with the 
> statistics module?

Yes, I think it is reasonable too. I'll aim to get this in to 3.6.

--




[issue27181] Add geometric mean to `statistics` module

2016-06-02 Thread Raymond Hettinger

New submission from Raymond Hettinger:

Steven, this seems like a reasonable suggestion (though I would expect someone 
else will immediately suggest a harmonic mean as well).   Is this within the 
scope of what you were trying to do with the statistics module?

--
assignee:  -> steven.daprano
nosy: +rhettinger




[issue27181] Add geometric mean to `statistics` module

2016-06-02 Thread Xiang Zhang

Changes by Xiang Zhang :


--
nosy: +steven.daprano




[issue27181] Add geometric mean to `statistics` module

2016-06-02 Thread Ram Rachum

Changes by Ram Rachum :


--
components: Library (Lib)
nosy: cool-RR
priority: normal
severity: normal
status: open
title: Add geometric mean to `statistics` module
type: enhancement
versions: Python 3.6
