I've moved this to python-ideas, where it is more appropriate, as Chris
notes.

On Thu, Oct 21, 2021, 8:42 PM Chris Angelico <ros...@gmail.com> wrote:

> On Fri, Oct 22, 2021 at 3:23 AM David Mertz, Ph.D.
> <david.me...@gmail.com> wrote:
> >
> > On Thu, Oct 21, 2021 at 2:52 AM Steven D'Aprano <st...@pearwood.info> wrote:
> >>
> >> On Tue, Oct 19, 2021 at 05:09:42PM -0700, Michael Selik wrote:
> >> > None and its ilk often conflate too many qualities. For example, is it
> >> > missing because it doesn't exist, it never existed, or because we never
> >> > received a value, despite knowing it must exist?
> >
> >
> >>
> >> 30+ years later, and we cannot easily, reliably or portably use NAN
> >> payloads. Most people don't care. If we offered them a dozen or a
> >> thousand distinct sentinels for all the various kinds of missing data,
> >> how many people would use them and how many would just stick to plain
> >> old None?
> >
> >
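To make the NaN-payload point concrete, here is a minimal sketch of stashing
an integer "reason code" in the low bits of a quiet NaN. It assumes IEEE 754
binary64 floats, and the helper names `nan_with_payload` and `payload_of` are
hypothetical, not an existing API. Whether the payload survives arithmetic,
serialization, or another library is platform-dependent, which is exactly
the portability problem Steven describes.

```python
import math
import struct

# Assumption: IEEE 754 binary64. Exponent bits all ones plus the top
# mantissa bit set gives a quiet NaN; the low 51 bits are free for a payload.
_QUIET_NAN_BITS = 0x7FF8_0000_0000_0000
_PAYLOAD_MASK = (1 << 51) - 1


def nan_with_payload(payload: int) -> float:
    """Build a quiet NaN carrying `payload` in its low 51 mantissa bits."""
    if not 0 <= payload <= _PAYLOAD_MASK:
        raise ValueError("payload must fit in 51 bits")
    raw = struct.pack("<Q", _QUIET_NAN_BITS | payload)
    return struct.unpack("<d", raw)[0]


def payload_of(value: float) -> int:
    """Read back the low 51 mantissa bits of a float (meaningful for NaNs)."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return bits & _PAYLOAD_MASK


tagged = nan_with_payload(42)
print(math.isnan(tagged), payload_of(tagged))
```

Nothing here guarantees the tag is still present after the value has passed
through other code, so treat this as an illustration rather than a
recommendation.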
> > In data science, I have been frustrated by the sparsity of ways of
> > spelling "missing value."
>
> Might be worth redirecting this to -ideas.
>
> > Besides the distinction Michael points out, and the one Steven drew in
> > relation to NaNs with payloads, I encounter missingness of various other
> > sorts as well. Crucially, an important kind of missing data is data where
> > the value I received seems unreliable and I have decided to *impute*
> > missingness rather than accept a value I believe is unreliable.
> >
> > But there is also something akin to what Michael points out (maybe it's
> > just an example). For example, "middle name" is something that some people
> > simply do not have, other people choose not to provide on a survey, and
> > others still we just don't know anything beyond "it's not there."
> >
>
> And some people have more than one (I have a brother with two of
> them). Not the best example to use, since names have WAY more
> complexities than different types of absence, but there are other
> cases where that sort of thing comes up. For instance, if someone says
> on a survey that s/he is in Australia, and then you ask for a
> postcode, then leaving it blank should be recorded as "chose not to
> provide"; but if the country is listed as Timor-Leste / East Timor,
> then "not applicable" would be appropriate, since the country doesn't
> use postal codes.
>
> > Of course, when I impute missingness, I can do so at various stages of
> > data cleaning, and for various different reasons or confidences. None (or
> > NaN) are sort of OK, but carrying metadata as to the nature of missingness
> > would be nice.
> >
>
> Right. Using postcodes as an example again, for someone in Australia,
> a postcode of "E3B 0H8" doesn't make sense, as that isn't the format
> we use. So you could wipe that out and replace it with "No postal
> code, malformed data entered".
>
> > So my strawman suggestion is tagging None's. I suppose spellings like
> > `None[reason]` or `None(reason)` are appealing.
> >
> > An obvious problem that I recognize is that it's unclear whether this can
> > "play nice" with the common idiom `if mydata is not None: ...`. None
> > really is a singleton, and a "tagged singleton" or "annotated singleton"
> > probably doesn't work well with Python's object model.
> >
> > My goal, of course, would be to have TaggedNone be a kind of subclass of
> > None, in the same way that bool is a subclass of int, and hence True is a
> > kind of 1. However, I'd want a large number of custom None's, with some
> > sort of accessible string or numeric code or something to inspect which one
> > it was.
> >
>
> But this is where I start to disagree. None should remain a singleton,
> but "no data available" could be its own thing, tied in with the way
> that you do your data storage and stats. As such, you wouldn't be
> checking it with 'is', so you wouldn't have that problem (the Python
> 'is' operator will only ever test for actual object identity).
>
> Keep None simple and dependable, and then "Missing Data" can be an
> entire class of values if you so desire.
>
> ChrisA
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/6NY5NQCJR3ROFBWWFOVD47HJFBQJC3IZ/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
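For discussion, here is a rough sketch of what Chris's "entire class of
values" might look like, using his postcode example. Everything in it is an
assumption made for illustration: the names `Reason`, `Missing`, `is_missing`
and `clean_postcode` are hypothetical, the four-digit check is a simplified
stand-in for real Australian postcode validation, and the Timor-Leste rule
just encodes the claim made above. None stays an ordinary singleton, and
missingness is tested with `isinstance` rather than `is`.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Union


class Reason(Enum):
    """Hypothetical reason codes for why a value is absent."""
    UNKNOWN = "it's not there"
    NOT_APPLICABLE = "not applicable"
    NOT_PROVIDED = "chose not to provide"
    MALFORMED = "malformed data entered"


@dataclass(frozen=True)
class Missing:
    """A missing value that carries metadata about why it is missing."""
    reason: Reason = Reason.UNKNOWN
    note: str = ""

    def __bool__(self) -> bool:
        # Falsy, like None, so `if value:` style checks keep working.
        return False


def is_missing(value: object) -> bool:
    """True for plain None and for any tagged Missing instance."""
    return value is None or isinstance(value, Missing)


def clean_postcode(country: str, raw: Optional[str]) -> Union[str, Missing]:
    """Return the postcode, or a Missing value explaining its absence."""
    if country == "Timor-Leste":
        return Missing(Reason.NOT_APPLICABLE)
    if not raw:
        return Missing(Reason.NOT_PROVIDED)
    # Simplifying assumption: an Australian postcode is exactly four digits.
    if country == "Australia" and not re.fullmatch(r"\d{4}", raw):
        return Missing(Reason.MALFORMED, note=raw)
    return raw


print(clean_postcode("Australia", "E3B 0H8"))
print(clean_postcode("Timor-Leste", None))
```

Because `Missing` is a frozen dataclass, two values with the same reason and
note compare equal, so they can be counted or grouped during data cleaning.
The obvious cost is that code written as `if mydata is not None` will treat
a `Missing` instance as real data unless it is updated to call something like
`is_missing`.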
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CFNI2QLBJ5D3YOQZ2TSZHZHQCPXCGAUN/
Code of Conduct: http://python.org/psf/codeofconduct/
