On Thu, Oct 21, 2021 at 2:52 AM Steven D'Aprano <st...@pearwood.info> wrote:

> On Tue, Oct 19, 2021 at 05:09:42PM -0700, Michael Selik wrote:
> > None and its ilk often conflate too many qualities. For example, is it
> > missing because it doesn't exist, it never existed, or because we never
> > received a value, despite knowing it must exist?
>


> 30+ years later, and we cannot easily, reliably or portably use NAN
> payloads. Most people don't care. If we offered them a dozen or a
> thousand distinct sentinels for all the various kinds of missing data,
> how many people would use them and how many would just stick to plain
> old None?


In data science, I have been frustrated by the sparsity of ways of spelling
"missing value."

Besides the distinction Michael points out, and the one Steven touches on
in relation to NaNs with payloads, I encounter missingness of various other
sorts as well.  Crucially, an important kind of missing data is data where
the value I received seems unreliable and I have decided to *impute*
missingness rather than accept a value I do not trust.

But there is also something akin to what Michael points out (maybe it's
just an instance of it).  A "middle name" is something that some people
simply do not have, that other people choose not to provide on a survey,
and about which, for others still, we know nothing beyond "it's not there."

Of course, when I impute missingness, I can do so at various stages of data
cleaning, and for various reasons or with various levels of confidence.
None (or NaN) is sort of OK, but carrying metadata as to the nature of the
missingness would be nice.

So my strawman suggestion is tagging None's.  I suppose spellings like
`None[reason]` or `None(reason)` are appealing.

One problem I recognize is that it's not clear this can "play nice" with
the common idiom `if mydata is not None: ...`.  None really is a singleton,
and a "tagged singleton" or "annotated singleton" probably doesn't work
well with Python's object model.
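
To make that problem concrete, here is a minimal sketch.  The `TaggedNone`
class and its `reason` attribute are hypothetical names of my own, not an
existing or proposed API, and the behavior shown is what I'd expect from
current CPython:

```python
NoneType = type(None)

# CPython refuses to use NoneType as a base class.
try:
    class SubNone(NoneType):
        pass
    subclassing_worked = True
except TypeError:
    subclassing_worked = False


class TaggedNone:
    """Hypothetical sentinel that carries a reason for missingness."""

    def __init__(self, reason):
        self.reason = reason

    def __bool__(self):
        return False  # falsy, like None

    def __repr__(self):
        return f"TaggedNone({self.reason!r})"


mydata = TaggedNone("refused to answer")

# The common idiom tests identity with the None singleton, so the
# tagged value is treated as if it were real data.
treated_as_present = mydata is not None
```

Making the object falsy helps with `if not mydata:`, but nothing an ordinary
class can do will make `mydata is None` true.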

My goal, of course, would be to have TaggedNone be a kind of subclass of
None (strictly, of NoneType), in the same way that bool is a subclass of
int, and hence True is a kind of 1.  However, I'd want a large number of
custom None's, with some sort of accessible string or numeric code or
something to inspect which one it was.
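
As a rough approximation of that today, something like the following works
without touching None itself.  This is only a sketch: the `Missing` enum,
its member names, and the `is_missing` helper are all made up for
illustration, and callers have to use the helper instead of `is None`.

```python
from enum import Enum

# The analogy: bool really is a subclass of int, so True is a kind of 1.
assert issubclass(bool, int) and True == 1


class Missing(Enum):
    """Hypothetical set of distinct "missing" markers with inspectable codes."""

    NOT_APPLICABLE = "not applicable"   # e.g. the person has no middle name
    NOT_PROVIDED = "not provided"       # e.g. declined to answer the survey
    UNKNOWN = "unknown"                 # all we know is "it's not there"
    IMPUTED = "imputed"                 # value received but judged unreliable

    def __bool__(self):
        return False  # behave like None in truth tests


def is_missing(value):
    """Replacement for `value is None` that also accepts tagged markers."""
    return value is None or isinstance(value, Missing)


middle_names = {
    "alice": "Marie",
    "bob": Missing.NOT_APPLICABLE,
    "carol": Missing.NOT_PROVIDED,
    "dave": None,
}

# Inspect which kind of missingness each record carries.
reasons = {
    name: value.value
    for name, value in middle_names.items()
    if isinstance(value, Missing)
}
```

Each member is still a singleton, so `value is Missing.NOT_PROVIDED` works
for a specific reason, but the plain `is not None` check is still fooled.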

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HAGJBRBXVWGGU2HRMVEWRNQSGUD75IYT/
Code of Conduct: http://python.org/psf/codeofconduct/