[issue40855] statistics.stdev ignore xbar argument

2020-06-17 Thread Matti


Matti  added the comment:

I meant to write "pre-calculate".

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40855] statistics.stdev ignore xbar argument

2020-06-17 Thread Matti


Matti  added the comment:

>I see what you're trying to do but think that interpretation is surprising
>and is at odds with the existing and intended uses of the *xbar* argument.  
>
>The goals were to allow the mean to be precomputed (common case) or to be 
>recentered (uncommon).  Neither case should have the effect of changing the 
>divisor.  
>
>We can't break existing code that assumes that stdev(data) is equal to 
>stdev(data, xbar=mean(data)).

Maybe the requirements are bugged? It seems to me that recalculating the mean is 
a very niche use case. You will save very little time on a call you do once. 

But what good is it to supply a re-centered mean if you get a wrong estimate 
of the standard deviation? If the mean is not the mean of the sample, it was not 
calculated using the sample, so there is no loss of degrees of freedom.

--




[issue40855] statistics.stdev ignore xbar argument

2020-06-13 Thread Raymond Hettinger

Raymond Hettinger  added the comment:

Thanks for the bug report 

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue40855] statistics.stdev ignore xbar argument

2020-06-13 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset 811e040b6e0241339545c2f055db8259b408802f by Miss Islington (bot) 
in branch '3.8':
bpo-40855: Fix ignored mu and xbar parameters (GH-20835) (GH-20863)
https://github.com/python/cpython/commit/811e040b6e0241339545c2f055db8259b408802f


--




[issue40855] statistics.stdev ignore xbar argument

2020-06-13 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset 55c1d21761e2e5feda5665065ea9e2280fa76113 by Miss Islington (bot) 
in branch '3.9':
bpo-40855: Fix ignored mu and xbar parameters (GH-20835) (#GH-20862)
https://github.com/python/cpython/commit/55c1d21761e2e5feda5665065ea9e2280fa76113


--




[issue40855] statistics.stdev ignore xbar argument

2020-06-13 Thread miss-islington


Change by miss-islington :


--
pull_requests: +20054
pull_request: https://github.com/python/cpython/pull/20863




[issue40855] statistics.stdev ignore xbar argument

2020-06-13 Thread miss-islington


Change by miss-islington :


--
nosy: +miss-islington
nosy_count: 3.0 -> 4.0
pull_requests: +20053
pull_request: https://github.com/python/cpython/pull/20862




[issue40855] statistics.stdev ignore xbar argument

2020-06-13 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset d71ab4f73887a6e2b380ddbbfe35b600d236fd4a by Raymond Hettinger in 
branch 'master':
bpo-40855: Fix ignored mu and xbar parameters (GH-20835)
https://github.com/python/cpython/commit/d71ab4f73887a6e2b380ddbbfe35b600d236fd4a


--




[issue40855] statistics.stdev ignore xbar argument

2020-06-12 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
keywords: +patch
pull_requests: +20029
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/20835




[issue40855] statistics.stdev ignore xbar argument

2020-06-11 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> do you have any comment on my previous answer?

I see what you're trying to do but think that interpretation is surprising
and is at odds with the existing and intended uses of the *xbar* argument.  

The goals were to allow the mean to be precomputed (common case) or to be 
recentered (uncommon).  Neither case should have the effect of changing the 
divisor.  

We can't break existing code that assumes that stdev(data) is equal to 
stdev(data, xbar=mean(data)).

>>> data = [1, 2]
>>> stdev(data)
0.7071067811865476
>>> stdev(data, xbar=mean(data))
0.7071067811865476

--




[issue40855] statistics.stdev ignore xbar argument

2020-06-11 Thread Matti


Matti  added the comment:

Hi Raymond and Steven!

I'm happy that you are solving this issue but do you have any comment on my 
previous answer?

--




[issue40855] statistics.stdev ignore xbar argument

2020-06-04 Thread Matti

Matti  added the comment:

If we estimate the mean using a sample, we lose one degree of freedom, so the 
sum of squares should be divided by N-1, while if we have the mean independent 
of the sample, it should be divided by N to be unbiased. 

i.e. 
example 1 
sqrt(((1-1.5)²+(2-1.5)²)/(2-1)) = 0.7...
example 3
sqrt(((1-1.5)²+(2-1.5)²)/(2)) = 0.5
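
The two divisors in these examples can be compared directly. The following is a minimal sketch, not the statistics module's implementation; the helper name stdev_about and its ddof parameter are made up for illustration.

```python
from math import sqrt
from statistics import pstdev

def stdev_about(data, center, ddof):
    """Root of the summed squared deviations about `center`,
    divided by len(data) - ddof (hypothetical helper)."""
    ss = sum((x - center) ** 2 for x in data)
    return sqrt(ss / (len(data) - ddof))

data = [1, 2]
sample_estimate = stdev_about(data, 1.5, ddof=1)      # divisor N-1
known_mean_estimate = stdev_about(data, 1.5, ddof=0)  # divisor N

# The divisor-N form is what the existing pstdev() function computes.
population_estimate = pstdev(data, mu=1.5)

print(sample_estimate, known_mean_estimate, population_estimate)
```

With this data the N-1 divisor gives about 0.707 and the N divisor gives 0.5, which matches examples 1 and 3.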

--




[issue40855] statistics.stdev ignore xbar argument

2020-06-03 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

Thanks Raymond, that is the intended effect, and your analysis seems 
plausible.

--




[issue40855] statistics.stdev ignore xbar argument

2020-06-03 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Perhaps this would work:

diff --git a/Lib/statistics.py b/Lib/statistics.py
index c76a6ca519..93a4633464 100644
--- a/Lib/statistics.py
+++ b/Lib/statistics.py
@@ -682,8 +682,10 @@ def _ss(data, c=None):
     calculated from ``c`` as given. Use the second case with care, as it can
     lead to garbage results.
     """
-    if c is None:
-        c = mean(data)
+    if c is not None:
+        T, total, count = _sum((x-c)**2 for x in data)
+        return (T, total)
+    c = mean(data)
     T, total, count = _sum((x-c)**2 for x in data)
     # The following sum should mathematically equal zero, but due to rounding
     # error may not.


Matti, where do you get 0.5 as the expected outcome for the third example?   
The actual mean is 1.5, so I would expect the third case to give sqrt(2)/2 or 
0.707.
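
For readers who want to try the idea behind the proposed patch without rebuilding CPython, here is a rough standalone approximation. It uses plain floats rather than the module's internal _sum() helper, and the names ss_sketch and stdev_sketch are hypothetical, so treat it as an illustration of the control flow only.

```python
from math import sqrt
from statistics import mean

def ss_sketch(data, c=None):
    """Simplified stand-in for _ss(): when the caller supplies c,
    return the sum of squares about c without any correction."""
    if c is not None:
        return sum((x - c) ** 2 for x in data)
    c = mean(data)
    total = sum((x - c) ** 2 for x in data)
    # Correction for rounding error in the computed mean; this term is
    # mathematically zero when c is the true mean.
    total -= sum(x - c for x in data) ** 2 / len(data)
    return total

def stdev_sketch(data, xbar=None):
    """Sample standard deviation; the divisor stays N-1 in every case."""
    return sqrt(ss_sketch(data, xbar) / (len(data) - 1))

data = [1, 2]
default_result = stdev_sketch(data)
true_mean_result = stdev_sketch(data, xbar=mean(data))
recentered_result = stdev_sketch(data, xbar=3)
print(default_result, true_mean_result, recentered_result)
```

Under these assumptions the first two calls agree (about 0.707), and recentering at 3 gives sqrt(5), about 2.236, because the supplied value is no longer cancelled.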

--
components: +Library (Lib)
versions: +Python 3.10, Python 3.8, Python 3.9 -Python 3.6




[issue40855] statistics.stdev ignore xbar argument

2020-06-03 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

The relevant code is in the _ss() helper function:

# The following sum should mathematically equal zero, but due to rounding
# error may not.
U, total2, count2 = _sum((x-c) for x in data)
assert T == U and count == count2
total -=  total2**2/len(data)

The intent was to correct for small rounding errors, but the effect is to undo 
any xbar value that differs from the true mean.

From a user point of view, the xbar parameter should have two effects: saving 
the computation time for the mean, and giving the ability to recenter the 
stdev/variance around a different point.   It does save a call to mean; 
however, that effort is mostly thrown away by the rounding adjustment code, 
which does even more work than computing the mean.

Likely, the fix for this is to skip the rounding adjustment code if the user 
supplies an xbar value.
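
A small sketch of why the adjustment cancels whatever value is supplied: algebraically, sum((x-c)**2) - sum(x-c)**2/n equals the sum of squares about the true mean for any c. The function below is a simplified stand-in that uses exact Fraction arithmetic, not the actual _ss() code, and the name ss_with_correction is hypothetical.

```python
from fractions import Fraction

def ss_with_correction(data, c):
    """Sum of squares about c, followed by the same kind of
    correction step as in the snippet quoted above."""
    n = len(data)
    total = sum((Fraction(x) - c) ** 2 for x in data)
    total2 = sum(Fraction(x) - c for x in data)
    return total - total2 ** 2 / n

data = [1, 2]
# Try the true mean (1.5) and two arbitrary other centers.
results = [ss_with_correction(data, Fraction(c)) for c in ("1.5", "3", "0")]
print(results)
```

All three centers give the same sum of squares, 1/2, which is consistent with stdev([1,2], 3) and stdev([1,2], 1.5) both returning 0.7071067811865476 in the original report.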

--
assignee:  -> steven.daprano
nosy: +steven.daprano




[issue40855] statistics.stdev ignore xbar argument

2020-06-03 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
nosy: +rhettinger




[issue40855] statistics.stdev ignore xbar argument

2020-06-03 Thread Matti


New submission from Matti :

statistics.variance also has the same problem. 

>>> import statistics
>>> statistics.stdev([1,2])
0.7071067811865476
>>> statistics.stdev([1,2], 3)
0.7071067811865476
>>> statistics.stdev([1,2], 1.5)
0.7071067811865476

should be 
0.7071067811865476
2.23606797749979
0.5

--
messages: 370659
nosy: Folket
priority: normal
severity: normal
status: open
title: statistics.stdev ignore xbar argument
type: behavior
versions: Python 3.6
