Re: [Numpy-discussion] Best dtype for Boolean values

2010-04-12 Thread Robert Kern
On Mon, Apr 12, 2010 at 10:59, John Jack itsmilesda...@gmail.com wrote:
 Hello all.
 I am (relatively) new to python, and 100% new to numpy.
 I need a way to store arrays of booleans and compare the arrays for
 equality.
 I assume I want arrays of dtype Boolean, and I should compare the arrays
 with array_equal
 >>> tmp.all_states
 array([False,  True, False], dtype=bool)
 >>> tmp1.all_states
 array([False, False, False], dtype=bool)
 >>> tmp1.all_states[1]=True
 >>> tmp1.all_states
 array([False,  True, False], dtype=bool)
 >>> array_equal(tmp.all_states,tmp1.all_states)
 True
 >>> any(tmp.all_states)
 True
 Would this be (a) the cheapest way (w.r.t. memory) to store Booleans and

Yes.

 (b) the most efficient way to compare two lists of Booleans?

Yes.
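
For reference, a minimal sketch of the comparison discussed above. The
arrays a and b are hypothetical stand-ins for tmp.all_states and
tmp1.all_states, and the functions are called through the np namespace
rather than imported bare as in the original session.

```python
import numpy as np

# Hypothetical stand-ins for tmp.all_states and tmp1.all_states.
a = np.array([False, True, False], dtype=bool)
b = np.array([False, False, False], dtype=bool)

# The arrays differ at index 1, so they are not equal yet.
equal_before = np.array_equal(a, b)

# After the assignment both arrays hold the same values.
b[1] = True
equal_after = np.array_equal(a, b)

# Equivalent spelling: element-wise comparison reduced with all().
equal_elementwise = bool((a == b).all())

# True if at least one element of a is True.
any_true = bool(a.any())
```

array_equal also checks that the shapes match, which the element-wise
spelling does not do on its own (it would broadcast instead).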

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Best dtype for Boolean values

2010-04-12 Thread Anne Archibald
On 12 April 2010 11:59, John Jack itsmilesda...@gmail.com wrote:
 Hello all.
 I am (relatively) new to python, and 100% new to numpy.
 I need a way to store arrays of booleans and compare the arrays for
 equality.
 I assume I want arrays of dtype Boolean, and I should compare the arrays
 with array_equal
 >>> tmp.all_states
 array([False,  True, False], dtype=bool)
 >>> tmp1.all_states
 array([False, False, False], dtype=bool)
 >>> tmp1.all_states[1]=True
 >>> tmp1.all_states
 array([False,  True, False], dtype=bool)
 >>> array_equal(tmp.all_states,tmp1.all_states)
 True
 >>> any(tmp.all_states)
 True
 Would this be (a) the cheapest way (w.r.t. memory) to store Booleans and (b)
 the most efficient way to compare two lists of Booleans?

The short answer is yes and yes.

The longer answer is that dtype=bool uses one byte per Boolean, which
is a tradeoff. In some sense, modern machines are happier working with
32- or 64-bit quantities, so loading a one-byte Boolean requires a
small amount of byte-shuffling. On the other hand, if you're really
short of memory, spending 8 bits on a one-bit value is wasteful. In
fact, since modern machines are almost always limited by memory
bandwidth, a packed Boolean data structure (eight Booleans per byte)
would probably be much faster for almost all operations in spite of
the bit-fiddling required. But such a representation is incompatible
with the whole infrastructure of numpy and so would require a great
deal of messy code to support.
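
A rough illustration of the memory tradeoff described above. The array
length n is an arbitrary choice for the example; np.packbits is used
here only to show what a one-bit-per-Boolean layout would cost.

```python
import numpy as np

n = 1_000_000  # arbitrary example size

# dtype=bool stores each Boolean in one full byte.
flags = np.zeros(n, dtype=bool)
bool_itemsize = flags.itemsize   # bytes per element
bool_nbytes = flags.nbytes       # total bytes for the array

# For comparison: the same flags stored as 64-bit integers.
int64_nbytes = np.zeros(n, dtype=np.int64).nbytes

# Packed layout: eight Booleans per uint8, so about n / 8 bytes.
packed = np.packbits(flags)
packed_nbytes = packed.nbytes
```

So the bool array is eight times smaller than an int64 array of the
same length, and a packed layout would be eight times smaller again.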

So yes, it's the best representation of Booleans available, unless
you're dealing with mind-bogglingly large arrays of them, in which
case some sort of packed-Boolean representation would be better. This
can even be partially supported by numpy, using uint8s, bitwise
operations, and manually-specified bitmasks. There are probably not
many applications for which this is worth the pain.
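
A sketch of what the uint8-plus-bitmask approach could look like. This
is an illustration, not an established numpy idiom: get_bit and
set_bit are hypothetical helpers, and the example relies on the
default bit order of np.packbits (first Boolean in the most
significant bit of each byte, zero padding at the end).

```python
import numpy as np

a = np.array([False, True, False, True, True,
              False, False, True, True, False])
b = np.array([False, True, True, False, True,
              False, False, False, True, True])

# Ten Booleans fit in two bytes; the unused trailing bits are zero.
pa = np.packbits(a)
pb = np.packbits(b)

# Bitwise operations act on eight Booleans per byte at once.
both = np.unpackbits(pa & pb)[:a.size].astype(bool)
either = np.unpackbits(pa | pb)[:a.size].astype(bool)

# Equality can be tested directly on the packed bytes, provided both
# arrays have the same length and the same (zero) padding.
packed_equal = np.array_equal(pa, pb)


def get_bit(packed, i):
    """Hypothetical helper: read Boolean i from a packed uint8 array."""
    mask = np.uint8(0x80 >> (i % 8))
    return bool(packed[i // 8] & mask)


def set_bit(packed, i, value):
    """Hypothetical helper: write Boolean i into a packed uint8 array."""
    mask = np.uint8(0x80 >> (i % 8))
    if value:
        packed[i // 8] |= mask
    else:
        packed[i // 8] &= ~mask
```

Note that the original length has to be tracked separately, since the
packed array alone cannot distinguish padding bits from real False
values.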

Anne
P.S. There's actually at least one python package for bit vectors,
outside numpy; I can't speak for how good it is, though. -A

 Thanks for your advice.
 -JJ.

