Note: if the snippet below doesn't display correctly in your e-mail reader,
you can read it here:
https://gist.github.com/pitrou/6a0ce89ce866bc0c70e33155503d1c47

On 01/07/2020 at 09:46, Antoine Pitrou wrote:
>
> Hello,
>
> Recent changes to PyArrow seem to have taken the stance that comparing
> null values should return null. The problem is that it breaks the
> expectation that comparisons should return booleans, and percolates into
> crazy behaviour in other places. Here is an example of such
> misbehaviour in the scalar refactor PR:
>
> >>> import pyarrow as pa
> >>> na = pa.scalar(None)
> >>> na == na
> <pyarrow.NullScalar: None>
> >>> na == 5
> <pyarrow.NullScalar: None>
> >>> bool(na == 5)
> True
> >>> if na == 5: print("yo!")
> yo!
> >>> na in [5]
> True
>
> But you can see it also with arrays containing null values:
>
> >>> pa.array([1, None]) in [pa.scalar(42)]
> True
>
> I think that Python equality operators should behave in a
> Python-sensible way (return True or False). Have people call another
> method if they like the fancy (or noxious, depending on the POV)
> semantics of returning null when comparing null with anything.
>
> (Note that NumPy doesn't have null scalars, so it can be less
> conservative in its customization of equality methods.)
>
> Regards
>
> Antoine.
>
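
To make the suggestion above concrete, here is a minimal pure-Python sketch
of the split between a boolean `==` and an opt-in null-propagating
comparison. It is only an illustration under stated assumptions: the class
`Scalar` and the method `null_eq` are hypothetical names, not PyArrow API,
and treating two nulls as equal under `==` is one possible choice rather
than something decided in this thread.

```python
class Scalar:
    """Hypothetical stand-in for a nullable scalar (not the PyArrow API)."""

    def __init__(self, value=None):
        self.value = value  # None represents a null value

    @staticmethod
    def _unwrap(other):
        # Accept either another Scalar or a plain Python value.
        return other.value if isinstance(other, Scalar) else other

    def __eq__(self, other):
        # Python-level equality: always a plain bool, so that `if`, `in`
        # and bool() behave as usual.
        # Assumption: two nulls compare equal here.
        return self.value == self._unwrap(other)

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one.
        return hash(self.value)

    def null_eq(self, other):
        # Opt-in null-propagating comparison (hypothetical method name):
        # returns None if either operand is null, otherwise a bool.
        other_value = self._unwrap(other)
        if self.value is None or other_value is None:
            return None
        return self.value == other_value


na = Scalar(None)
print(na == 5)          # plain boolean comparison
print(na in [5])        # containment relies on the boolean ==
print(na.null_eq(5))    # null-propagating semantics, on request only
```

With this split, `if na == 5:` and `na in [5]` no longer take the true
branch by accident, while callers who want null-propagating semantics can
still ask for them explicitly.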