[Python-ideas] Re: NAN handling in statistics functions

2021-08-28 Thread David Mertz, Ph.D.
On Sat, Aug 28, 2021, 8:34 AM Stephen J. Turnbull <stephenjturnb...@gmail.com> wrote: > David Mertz, Ph.D. writes: > > > NANs do not necessarily represent missing data. > > > I think in the context of `stats` they do. But this is color of > bikeshed, and I defer to you, of course. > > I have a

[Python-ideas] Re: NAN handling in statistics functions

2021-08-28 Thread Marc-Andre Lemburg
On 28.08.2021 14:33, Richard Damon wrote: > On 8/28/21 6:23 AM, Marc-Andre Lemburg wrote: >> To me, the behavior looked a lot like stripping NANs left and right >> from the list, but what you're explaining makes this appear even more >> as a bug in the implementation of median() - basically wrong

[Python-ideas] Re: NAN handling in statistics functions

2021-08-28 Thread Richard Damon
On 8/28/21 6:23 AM, Marc-Andre Lemburg wrote: > To me, the behavior looked a lot like stripping NANs left and right > from the list, but what you're explaining makes this appear even more > as a bug in the implementation of median() - basically wrong assumptions > about NANs sorting correctly.

[Python-ideas] Re: NAN handling in statistics functions

2021-08-28 Thread Marc-Andre Lemburg
On 28.08.2021 05:32, Steven D'Aprano wrote: > On Thu, Aug 26, 2021 at 09:36:27AM +0200, Marc-Andre Lemburg wrote: > >> Indeed. The NAN handling in median() looks like a bug, more than >> anything else: > > [slightly paraphrased] > l1 = [1,2,nan,4] > l2 = [nan,1,2,4] >> >
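
The order dependence raised in this message can be reproduced with the standard library alone. The sketch below reuses the two lists quoted above (`l1`, `l2`); the names `m1` and `m2` are illustrative only, and the behavior described in the comments is what CPython's sort happens to do with these inputs, not a documented guarantee.

```python
import math
import statistics

nan = float("nan")

l1 = [1, 2, nan, 4]
l2 = [nan, 1, 2, 4]

# median() sorts its input first. Every "<" comparison that involves a NaN
# is False, so for these two inputs the sort sees an already ordered run
# and leaves the NaN wherever it started.
m1 = statistics.median(l1)  # middle pair of the "sorted" data is (2, nan)
m2 = statistics.median(l2)  # middle pair of the "sorted" data is (1, 2)

print(sorted(l1), m1)
print(sorted(l2), m2)
print(math.isnan(m1), m2)
```

The same four values give different medians depending only on where the NaN sits, which is the behavior the thread describes as a bug.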

[Python-ideas] Re: NAN handling in statistics functions

2021-08-28 Thread Marc-Andre Lemburg
On 28.08.2021 07:14, Christopher Barker wrote: > > SciPy should probably also be a data-point, it uses: > >     nan_policy : {'propagate', 'raise', 'omit'}, optional > > > +1 > > Also +1 on a string flag, rather than an Enum. Same here. Codecs use strings as well: 'strict',
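
A string flag of the kind discussed here could look roughly like the following. This is only a sketch: `median_with_policy` and `_is_nan` are hypothetical names, not part of the `statistics` module, and the three policy strings are borrowed from the SciPy spelling quoted above.

```python
import math
import statistics


def _is_nan(x):
    # Hypothetical helper: a quiet NaN is the only value unequal to itself.
    return x != x


def median_with_policy(data, nan_policy="propagate"):
    """Hypothetical wrapper around statistics.median (an assumption,
    not an existing API) that takes a SciPy-style string flag."""
    values = list(data)
    has_nan = any(_is_nan(x) for x in values)

    if nan_policy == "propagate":
        # Any NaN in the input makes the result NaN, regardless of position.
        return math.nan if has_nan else statistics.median(values)
    if nan_policy == "raise":
        if has_nan:
            raise ValueError("input contains NaN")
        return statistics.median(values)
    if nan_policy == "omit":
        # Drop NaNs, then take the median of what is left.
        return statistics.median([x for x in values if not _is_nan(x)])
    raise ValueError(f"unknown nan_policy: {nan_policy!r}")


nan = float("nan")
print(median_with_policy([nan, 1, 2, 4], nan_policy="omit"))
```

Plain strings keep the call site short and mirror the `errors='strict'` style of the codecs API mentioned in the message; an Enum would add validation at the cost of an extra import for callers.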

[Python-ideas] Re: NAN handling in statistics functions

2021-08-28 Thread David Mertz, Ph.D.
On Sat, Aug 28, 2021, 1:58 AM Steven D'Aprano wrote: > On Sat, Aug 28, 2021 at 01:36:33AM -0400, David Mertz, Ph.D. wrote: > > > I like the statsmodels spelling better: missing : str; Available options > > are ‘none’, ‘drop’, and ‘raise’ > > NANs do not necessarily represent missing data. > I