jorisvandenbossche commented on issue #34987:
URL: https://github.com/apache/arrow/issues/34987#issuecomment-1501623892
While I certainly agree the current behaviour is not ideal, I am a bit
hesitant to just add `__bool__` to the boolean scalars.
- Several broader questions for `__bool__` itself:
  - If we add it to BooleanScalar, should we add it to all other scalars
    as well?
  - If we add it to other scalar types, what should its behaviour be? For
    other numeric types it probably makes sense to follow 0 being falsey. But
    personally I am not fully convinced we should start making ListScalars
    truthy/falsey depending on the length of the list scalar value.
  - As @randolf-scholz already mentioned, we then also need to make a
    decision about the truth value of null.
- But aside from `bool(..)` there are other aspects that people would then
  expect from normal "scalar" values, such as equality. Currently `__eq__` is
  very dumb: `pa.scalar(None) == pa.scalar(None)` gives True (which doesn't
  match the element-wise equality behaviour for arrays with equivalent
  values, where comparing nulls gives null), and it requires exact types
  (again inconsistent with the element-wise "equal" kernel).
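For reference, this is how the plain Python objects corresponding to such scalars behave under `bool()`. It is only a sketch of Python's built-in truthiness, using stand-in values rather than actual pyarrow scalars, and says nothing about what pyarrow does or should do:

```python
# Plain Python values standing in for what `as_py()` might return
# (illustrative stand-ins chosen for this sketch, not pyarrow output).
examples = {
    "zero int": 0,
    "non-zero int": 5,
    "empty list": [],
    "list holding a zero": [0],
    "null": None,
}

# Python's built-in truthiness rules applied to each stand-in value.
truthiness = {label: bool(value) for label, value in examples.items()}

for label, result in truthiness.items():
    print(f"{label}: {result}")
```

Note that in plain Python a list is falsey only when it is empty, regardless of its contents, and `None` is falsey. Those are exactly the two cases questioned above.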
We could also say: if you want python-like scalar behaviour, just call
`as_py()` on the pyarrow scalar.
(and we could also raise an error in `__bool__` always, to make it clear
this method isn't actually implemented)
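A minimal sketch of that last option, using a hypothetical `StrictScalar` class that is not part of pyarrow and only mimics the `as_py()` method name: implicit truth testing raises, so the caller has to convert explicitly.

```python
class StrictScalar:
    """Hypothetical stand-in for a scalar wrapper; not a pyarrow class."""

    def __init__(self, value):
        self._value = value

    def as_py(self):
        # Return the wrapped plain Python value.
        return self._value

    def __bool__(self):
        # Refuse implicit truth testing instead of silently returning True.
        raise TypeError(
            "truth value of a scalar is ambiguous; call as_py() first"
        )


flag = StrictScalar(True)

try:
    if flag:  # implicit bool() call goes through __bool__ and raises
        pass
    raised = False
except TypeError:
    raised = True

# The explicit route: convert to a Python object, then test it.
explicit = bool(flag.as_py())

print(raised, explicit)
```

The error message and the choice of `TypeError` are assumptions made for illustration. The point is only that an `if scalar:` written by mistake fails loudly instead of always taking the true branch.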