Re: [Numpy-discussion] Boolean arrays with nulls?

Stefan van der Walt Thu, 18 Apr 2019 09:45:55 -0700

Hi Stuart,

On Thu, 18 Apr 2019 09:12:31 -0700, Stuart Reynolds wrote:
> Is there an efficient way to represent bool arrays with null entries?


You can use the bool dtype:

In [5]: x = np.array([True, False, True])

In [6]: x
Out[6]: array([ True, False,  True])

In [7]: x.dtype
Out[7]: dtype('bool')

You should note that this stores one True/False value per byte, so it is
not optimal in terms of memory use.  There is no easy way to do
bit-arrays with NumPy, because we use strides to determine how to move
from one memory location to the next.
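To make the memory cost concrete, here is a small sketch (not part of the original exchange) that checks the one-byte-per-value storage and shows np.packbits / np.unpackbits as one way to hold the flags at one bit each. The variable names are only illustrative, and the packed array is a plain uint8 array, so it has to be unpacked again before it can be used for boolean indexing or logic.

```python
import numpy as np

# Nine flags stored with the regular bool dtype: one byte per value.
flags = np.array([True, False, True, True, False, False, True, False, True])
bool_itemsize = flags.itemsize      # bytes used per element
bool_nbytes = flags.nbytes          # total bytes for the array

# Pack eight flags into each byte; the last byte is zero-padded.
packed = np.packbits(flags)
packed_nbytes = packed.nbytes

# Unpack and trim the padding to recover the original flags.
restored = np.unpackbits(packed)[:flags.size].astype(bool)

print(bool_itemsize, bool_nbytes, packed_nbytes)
print(np.array_equal(restored, flags))
```

The packed form trades convenience for space: every read goes through an unpack step, which is the practical consequence of NumPy addressing elements by byte strides.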

See also: 
https://www.reddit.com/r/Python/comments/5oatp5/one_bit_data_type_in_numpy/

> What I’m hoping for is that there’s a structure that is ‘viewed’ as
> nan-able float data, but backed by a more efficient structure
> internally.

There are good implementations of this idea, such as:

https://github.com/ilanschnell/bitarray

Those structures typically cannot use the NumPy machinery, though.
With the new __array_function__ protocol (NEP 18), you should at least
be able to build something with an API close to NumPy's.
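As a rough illustration of what such a structure could look like, the sketch below stores a nullable boolean array as two packed bit arrays (values and validity) and converts to a NaN-able float array on demand. The class PackedNullableBool and its method to_float are hypothetical names invented for this example. It relies only on the long-standing __array__ conversion hook rather than on any newer dispatch mechanism, so NumPy functions see a temporary float copy, not the packed data.

```python
import numpy as np


class PackedNullableBool:
    """Hypothetical container: packed value bits plus packed validity bits."""

    def __init__(self, values, valid):
        values = np.asarray(values, dtype=bool)
        valid = np.asarray(valid, dtype=bool)
        if values.ndim != 1 or values.shape != valid.shape:
            raise ValueError("values and valid must be 1-D and the same length")
        self._n = values.size
        # Null slots are stored as 0 in the value bits; two bits per entry.
        self._values = np.packbits(values & valid)
        self._valid = np.packbits(valid)

    def __len__(self):
        return self._n

    def to_float(self):
        """Return a float64 copy with NaN wherever the entry is null."""
        values = np.unpackbits(self._values)[: self._n].astype(bool)
        valid = np.unpackbits(self._valid)[: self._n].astype(bool)
        out = values.astype(np.float64)
        out[~valid] = np.nan
        return out

    def __array__(self, dtype=None, copy=None):
        # Lets np.asarray(obj) and most NumPy functions accept this object.
        out = self.to_float()
        return out if dtype is None else out.astype(dtype)


# Third entry is null: its validity bit is False.
x = PackedNullableBool([True, False, True], [True, True, False])
as_float = np.asarray(x)
true_count = np.nansum(as_float)
print(as_float, true_count)
```

Storage here is two bits per entry, against one byte for a bool array or eight bytes for float64 with NaN. The cost is that every operation unpacks first, so this suits data that is stored much more often than it is computed on.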

Best regards,
Stéfan
_______________________________________________
NumPy-Discussion mailing list
[email protected]
https://mail.python.org/mailman/listinfo/numpy-discussion
