[Numpy-discussion] Re: np.bool_ vs Python bool behavior

Charles R Harris Sun, 13 Mar 2022 09:53:11 -0700

On Sun, Mar 13, 2022 at 10:31 AM Charles R Harris <charlesr.har...@gmail.com>
wrote:


>
>
> On Sat, Mar 12, 2022 at 4:53 PM Jacob Reinhold <jcreinh...@gmail.com>
> wrote:
>
>> A pain point I ran into a while ago was assuming that an np.ndarray with
>> dtype=np.bool_ would act similarly to the Python built-in boolean under
>> addition. This is not the case, as shown in the following code snippet:
>>
>> >>> np.bool_(True) + True
>> True
>> >>> True + True
>> 2
>>
>> In fact, I'm somewhat confused about all the arithmetic operations on
>> boolean arrays:
>>
>> >>> np.bool_(True) * True
>> True
>> >>> np.bool_(True) / True
>> 1.0
>> >>> np.bool_(True) - True
>> TypeError: numpy boolean subtract, the `-` operator, is not supported,
>> use the bitwise_xor, the `^` operator, or the logical_xor function instead.
>> >>> for x, y in ((False, False), (False, True), (True, False), (True,
>> True)): print(np.bool_(x) ** y, end=" ")
>> 1 0 1 1
>>
>> I get that addition corresponds to "logical or" and multiplication
>> corresponds to "logical and", but I'm lost on the division and
>> exponentiation operations given that addition and multiplication don't
>> promote the dtype to integers or floats.
>>
>> If arrays stubbornly refused to ever change type or interact with objects
>> of a different type under addition, that'd be one thing, but they do change:
>>
>> >>> np.uint8(0) - 1
>> -1
>> >>> (np.uint8(0) - 1).dtype
>> dtype('int64')
>> >>> (np.uint8(0) + 0.1).dtype
>> dtype('float64')
>>
>> This dtype change can also be seen in the division and exponentiation
>> above for np.bool_.
>>
>> Why the discrepancy in behavior for np.bool_? And why are arithmetic
>> operations for np.bool_ inconsistently promoted to other data types?
>>
>> If all arithmetic operations on np.bool_ resulted in integers, that would
>> be consistent (so easier to work with) and wouldn't restrict expressiveness
>> because there are also "logical or" (|) and "logical and" (&) operations
>> available. Alternatively, division and exponentiation could throw errors
>> like subtract, but the discrepancy between np.bool_ and the Python built-in
>> bool for addition and multiplication would remain.
>>
>> For context, I ran into an issue with this discrepancy in behavior while
>> working on an image segmentation problem. For binary segmentation problems,
>> we make use of boolean arrays to represent where an object is (the
>> locations in the array which are "True" correspond to the
>> foreground/object-of-interest, "False" corresponds to the background). I
>> was aggregating multiple binary segmentation arrays to do a majority vote
>> with an implementation that boiled down to the following:
>>
>> >>> pred1, pred2, ..., predN = np.array(..., dtype=np.bool_),
>> np.array(..., dtype=np.bool_), ..., np.array(..., dtype=np.bool_)
>> >>> aggregate = (pred1 + pred2 + ... + predN) / N
>> >>> agg_pred = aggregate >= 0.5
>>
>> Which returned (1.0 / N) in all indices which had at least one "True"
>> value in a prediction. I assumed that the arrays would be promoted to
>> integers (False -> 0; True -> 1) and added so that agg_pred would hold the
>> majority vote result. But agg_pred was always empty because the maximum
>> value was (1.0 / N) for N > 2.
>>
>> My current "work around" is to remind myself of this discrepancy by
>> importing "builtins" from the standard library and annotating the relevant
>> functions and variables as using the "builtins.bool" to explicitly
>> distinguish it from np.bool_ behavior where applicable, and add checks
>> and/or conversions on top of that. But why not make np.bool_ act like the
>> built-in bool under addition and multiplication and let users use the
>> already existing | and & operations for "logical or" and "logical and"?
>>
>
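For the majority-vote case described above, one way to get integer counts is
to sum along a stacked axis instead of chaining "+". This is only a minimal
sketch, assuming all predictions share one shape; the name majority_vote and
the sample arrays are made up for illustration and are not from the original
post.

```python
import numpy as np


def majority_vote(preds):
    """Return True where at least half of the boolean masks are True.

    `preds` is a sequence of same-shaped boolean arrays (hypothetical helper).
    """
    stacked = np.stack(preds)        # shape (N, ...), dtype bool
    votes = stacked.sum(axis=0)      # sum() of bools accumulates in an integer dtype
    return votes >= stacked.shape[0] / 2


# Made-up sample predictions, N = 3
pred1 = np.array([True, True, False, False])
pred2 = np.array([True, False, True, False])
pred3 = np.array([True, True, False, False])

agg_pred = majority_vote([pred1, pred2, pred3])

# For comparison: chained "+" stays boolean and acts as "logical or"
chained = pred1 + pred2 + pred3
```

Casting each array with .astype(int) before adding should work as well; the
stacked sum just avoids repeating the cast for every prediction.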
> NumPy bool_ is a type and is only represented by the values (0, 1), with
> the "+" and "*" operators overloaded to be "or" and "and", respectively.
> The later Python bool is pretty much just an integer, as that was backward
> compatible. So you end up with things like
>
> In [20]: type(np.bool_(1) + np.bool_(1))  # "+" is the "or" operator
> Out[20]: np.bool_
>
> In [21]: type(bool(1) + bool(1))  # "+" is integer addition
> Out[21]: int
>
> In [22]: type(np.bool_(1) * np.bool_(1))  # "*" is the "and" operator
> Out[22]: np.bool_
>
> In [23]: type(bool(1) * bool(1))  # "*" is integer multiplication
> Out[23]: int
>
> NumPy bool_ will be promoted to int when combined with Python ints.
>
>
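A short sketch of that promotion, for illustration only. The variable names
are arbitrary, and the exact integer dtype depends on the platform, so the
comments only say "integer".

```python
import numpy as np

# np.bool_ combined with a Python bool stays boolean: "+" is "logical or"
with_bool = np.bool_(True) + True

# np.bool_ combined with a Python int is promoted to an integer
with_int = np.bool_(True) + 1

# The same holds for arrays
mask = np.array([True, False, True])
mask_plus_one = mask + 1             # integer array with values 2, 1, 2

# Plain Python bools are integers already
py_sum = True + True                 # the int 2

print(type(with_bool), type(with_int), mask_plus_one.dtype, py_sum)
```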
The non-logical operators convert np.bool_ to numbers with the exception of
"-", which also used to be overloaded as a logical operator. We raised an
error when we changed that so that people could adjust their code and use
"^" instead. Long term it might make sense to reintroduce "-" with integer
promotion.
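
To make the two options concrete, here is a small sketch with made-up arrays:
"^" (or np.logical_xor) for the logical meaning, and an explicit cast when an
arithmetic difference is wanted. The cast to int is just one choice among
several.

```python
import numpy as np

a = np.array([True, True, False])
b = np.array([True, False, False])

# Logical replacement for the old boolean "-": exclusive or
xor = a ^ b                          # same result as np.logical_xor(a, b)

# Arithmetic difference needs an explicit cast to a numeric dtype
diff = b.astype(int) - a.astype(int)

# Plain "-" on boolean arrays raises TypeError
try:
    a - b
    subtract_raised = False
except TypeError:
    subtract_raised = True

print(xor, diff, subtract_raised)
```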

Chuck

_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com
