New submission from Raymond Hettinger <[email protected]>:
The current mean() function makes heroic efforts to achieve last-bit accuracy
and, when possible, to retain the data type of the input.
What is needed is an alternative that has a simpler signature, that is much
faster, that is highly accurate without demanding perfection, and that usually
does what people expect mean() to do, the same as their calculators or
numpy.mean():
    def fmean(seq: Sequence[float]) -> float:
        return math.fsum(seq) / len(seq)
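As a rough illustration of the trade-off described above, the sketch below compares the existing statistics.mean() with the proposed one-liner. The input values are made up for the example, and the fmean() defined here is only the proposal from this message, written out with the imports it needs.

```python
import math
from fractions import Fraction
from statistics import mean
from typing import Sequence


def fmean(seq: Sequence[float]) -> float:
    # Proposed fast path from this message: always returns a float.
    return math.fsum(seq) / len(seq)


fracs = [Fraction(1, 3), Fraction(1, 6)]  # illustrative data only

exact = mean(fracs)     # mean() retains the input type (a Fraction here)
approx = fmean(fracs)   # fmean() converts every input to float

print(type(exact).__name__, exact)
print(type(approx).__name__, approx)
```

The design choice is that fmean() gives up type retention in exchange for a single pass through math.fsum().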
On my current 3.8 build, this code gives an approximately 500x speed-up (almost
three orders of magnitude). Note that having a fast fmean() function is
important in resampling statistics, where mean() is typically called many times:
http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm
$ ./python.exe -m timeit -r 11 -s 'from random import random' \
    -s 'from statistics import mean' \
    -s 'seq = [random() for i in range(10_000)]' 'mean(seq)'
50 loops, best of 11: 6.8 msec per loop
$ ./python.exe -m timeit -r 11 -s 'from random import random' \
    -s 'from math import fsum' \
    -s 'mean=lambda seq: fsum(seq)/len(seq)' \
    -s 'seq = [random() for i in range(10_000)]' 'mean(seq)'
2000 loops, best of 11: 155 usec per loop
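To make the resampling use case concrete, here is a minimal bootstrap sketch in which the mean is computed once per resample. The helper name bootstrap_means, the seed, the sample sizes, and the rough percentile cut-offs are all assumptions chosen for illustration; fmean() is the proposed one-liner from this message.

```python
import math
import random
from typing import List, Sequence


def fmean(seq: Sequence[float]) -> float:
    # Proposed fast path from this message.
    return math.fsum(seq) / len(seq)


def bootstrap_means(data: Sequence[float], n_resamples: int,
                    rng: random.Random) -> List[float]:
    # Hypothetical helper: mean of each resample drawn with replacement.
    n = len(data)
    return [fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]


rng = random.Random(12345)  # fixed seed so the sketch is repeatable
data = [rng.random() for _ in range(1_000)]

# The mean is called once per resample, so its speed dominates the run time.
means = sorted(bootstrap_means(data, 2_000, rng))

# Rough 90% percentile interval for the mean (5th and 95th percentiles).
low, high = means[100], means[1899]
print(low, fmean(data), high)
```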
----------
assignee: steven.daprano
components: Library (Lib)
messages: 334894
nosy: rhettinger, steven.daprano, tim.peters
priority: normal
severity: normal
status: open
title: Add statistics.fmean(seq)
type: behavior
versions: Python 3.8
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue35904>
_______________________________________