[
https://issues.apache.org/jira/browse/ARROW-12695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346046#comment-17346046
]
Joris Van den Bossche edited comment on ARROW-12695 at 5/17/21, 10:17 AM:
--------------------------------------------------------------------------
Currently pyarrow scalars don't implement {{\_\_bool\_\_}}. Python then treats
an object as True by default, but if the object is "sequence-like" (it defines
{{\_\_len\_\_}}), truth testing falls back to the length instead. This is
described at https://docs.python.org/3/library/stdtypes.html#truth-value-testing
So the underlying reason for the error is that {{len()}} fails on a null list
scalar:
{code}
>>> len(pa.scalar([1, 2], type=pa.list_(pa.int32())))
2
>>> len(pa.scalar(None, type=pa.list_(pa.int32())))
...
TypeError: object of type 'NoneType' has no len()
{code}
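As a rough illustration of the truth-testing rule described above, here is a minimal pure-Python sketch. The classes {{Plain}} and {{Sized}} are hypothetical stand-ins, not pyarrow classes; they only show how {{bool()}} behaves when {{\_\_bool\_\_}} is missing and {{\_\_len\_\_}} may or may not be defined.

```python
class Plain:
    """Hypothetical class with neither __bool__ nor __len__: always truthy."""


class Sized:
    """Hypothetical class with __len__ but no __bool__: bool() uses the length."""

    def __init__(self, items):
        self.items = items

    def __len__(self):
        # Raises TypeError when items is None, similar to the failure above.
        return len(self.items)


print(bool(Plain()))        # True
print(bool(Sized([1, 2])))  # True
print(bool(Sized([])))      # False

try:
    bool(Sized(None))
except TypeError as exc:
    # The error raised inside __len__ propagates out of bool().
    print(exc)
```

This mirrors the three outcomes in the report: a scalar without {{\_\_len\_\_}} is always True, one with {{\_\_len\_\_}} follows its length, and one whose {{\_\_len\_\_}} fails makes {{bool()}} raise.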
But that raises the question of what this should return instead. Returning 0
doesn't feel correct either, since an empty (non-null) list scalar already has
a length of zero.
In general, I think it will be hard to give a nice and consistent interface for
pyarrow scalars when null scalars are involved (though we could provide better
error messages).
[~mosalx] what's your use case for {{bool(null_scalar)}}, and what do you
think it should return? (True, like the other scalars?)
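To make the trade-off concrete, here is a small sketch of the delegation idea from the issue description, using a hypothetical {{DelegatingScalar}} wrapper (an assumption for illustration only, not pyarrow's actual class or API). {{None}} stands in for a null scalar.

```python
class DelegatingScalar:
    """Hypothetical stand-in for a scalar; not pyarrow's actual class."""

    def __init__(self, value):
        # Wrapped Python object; None stands in for a null scalar.
        self.value = value

    def __bool__(self):
        # Delegate truthiness to the wrapped Python object.
        return bool(self.value)


for value in (None, [], [1, 2], 0, 1):
    print(repr(value), bool(DelegatingScalar(value)))
```

Under this assumption a null scalar, an empty list scalar and an integer scalar holding 0 would all be False, so {{bool()}} alone could not tell a null apart from a valid empty or zero value.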
> [Python] bool value of scalars depends on data type
> ---------------------------------------------------
>
> Key: ARROW-12695
> URL: https://issues.apache.org/jira/browse/ARROW-12695
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 4.0.0
> Environment: Windows 10
> python 3.9.4
> Reporter: Sergey Mozharov
> Priority: Major
>
> `pyarrow.Scalar` and its subclasses do not implement the `__bool__` method.
> The default behavior does not seem to do the right thing. For example:
> {code:python}
> >>> import pyarrow as pa
> >>> na_value = pa.scalar(None, type=pa.int32())
> >>> bool(na_value)
> True
> >>> na_value = pa.scalar(None, type=pa.struct([('a', pa.int32())]))
> >>> bool(na_value)
> False
> >>> bool(pa.scalar(None, type=pa.list_(pa.int32())))
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File "pyarrow\scalar.pxi", line 572, in pyarrow.lib.ListScalar.__len__
> TypeError: object of type 'NoneType' has no len()
> >>>
> {code}
> Please consider implementing the `__bool__` method. It seems reasonable to
> delegate to the `__bool__` method of the wrapped object.