Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-08 Thread josef . pktd
On Fri, Dec 6, 2013 at 11:25 PM, Charles R Harris
charlesr.har...@gmail.com wrote:



 On Fri, Dec 6, 2013 at 2:14 PM, josef.p...@gmail.com wrote:

 On Fri, Dec 6, 2013 at 3:50 PM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
  On Fri, 2013-12-06 at 15:30 -0500, josef.p...@gmail.com wrote:
  On Fri, Dec 6, 2013 at 2:59 PM, Nathaniel Smith n...@pobox.com wrote:
   On Fri, Dec 6, 2013 at 11:55 AM, Alexander Belopolsky
   ndar...@mac.com wrote:
  
  
  
   On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com
   wrote:
  
   On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
unary versus binary minus
  
   Oh right; I consider binary `-` broken for
   Boolean arrays. (Sorry Alexander; I did not
   see your entire issue.)
  
  
I'd rather write ~ than unary - if that's what it is.
  
   I agree.  So I have no objection to elimination
   of the `-`.
  
  
   It looks like we are close to reaching a consensus on the following
   points:
  
   1. * is well-defined on boolean arrays and may be used in preference
   to & in
   code that is designed to handle 1s and 0s of any dtype in addition
   to
   booleans.
  
   2. + is defined consistently with * and the only issue is the
   absence of
   additive inverse.  This is not a problem as long as presence of -
   does not
   suggest otherwise.
  
   3. binary and unary minus should be deprecated because its use in
   expressions where variables can be either boolean or numeric would
   lead to
   subtle bugs.  For example -x*y would produce different results from
   -(x*y)
   depending on whether x is boolean or not.  In all situations, ^ is
   preferable to binary - and ~ is preferable to unary -.
  
   4. changing boolean arithmetics to auto-promotion to int is
   precluded by a
   significant use-case of boolean matrices.
  
   +1
 
  +0.5
  (I would still prefer a different binary minus, but it would be
  inconsistent with a logical unary minus that negates.)
 
 
  The question is if the current xor behaviour can make sense? It doesn't
  seem to make much sense mathematically? Which only leaves that `abs(x -
  y)` is actually what a (python) programmer might expect.
  I think I would like to deprecate at least the unary one. The ~ kind of
  behaviour just doesn't fit as far as I can see.

 I haven't seen any real use cases for xor yet.


 Using it instead of '+' yields a boolean ring instead of semi-ring. Papers
 from the first quarter of the last century used it pretty often on that
 account, hence 'sigma-rings', etc. Eventually the simplicity of the
 inclusive or overcame that tendency.

 My impression is that both plus and minus are just overflow accidents
 and not intentional. plus works in a useful way, minus as xor might be
 used once per century.


 It's certainly weird given that '+' means the inclusive or. I think '^' is
 much preferable.
 Although it makes some sense if one can keep the semantics straight.
 Complicated, though.

I'm looking at the test failure with allclose

Looks like - as xor still makes sense in some cases, because it
doesn't need special cases for equality checks for example.
x - y == 0 iff x == y

What happens to np.diff?

 np.diff(m1)
array([False,  True, False], dtype=bool)

I'm using code like that to get the length of runs in a runstest.

But in my current code, I actually have astype(int) and also use
floats later on.
If I read my (incompletely documented) code correctly, I needed also
the sign not just the run length.

Just another argument for what minus could be.

Josef
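
A minimal sketch of the run-length calculation described above, assuming a small made-up mask `m1` (not the one from the session). Casting to int before `np.diff` keeps the sign of each change, which a boolean difference cannot carry.

```python
import numpy as np

# Hypothetical example mask; the values are made up for illustration.
m1 = np.array([True, True, False, False, False, True])

# Signed steps: +1 marks False -> True, -1 marks True -> False.
steps = np.diff(m1.astype(int))

# Positions where a new run starts, plus both ends of the array.
change_points = np.flatnonzero(steps) + 1
boundaries = np.concatenate(([0], change_points, [m1.size]))

# Length of each run of equal values.
run_lengths = np.diff(boundaries)

print(steps.tolist())        # [0, -1, 0, 0, 1]
print(run_lengths.tolist())  # [2, 3, 1]
```

The sign in `steps` says whether a run of True values is starting or ending, which is the extra information mentioned above.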



 I would deprecate both unary and binary minus.

 (And when nobody is looking in two versions from now, I would add a
 binary minus that overflows to the clipped version, so I get a set
 subtraction. :)


 Where is '\' when you need it?

 snip

 Chuck

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion



Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Nathaniel Smith
On Thu, Dec 5, 2013 at 7:33 PM,  josef.p...@gmail.com wrote:
 On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
 Hey,

 there was a discussion that for numpy booleans math operators +,-,* (and
 the unary -), while defined, are not very helpful. I have set up a quick
 PR with start (needs some fixes inside numpy still):

 https://github.com/numpy/numpy/pull/4105

 The idea is to deprecate these, since the binary operators |,^,& (and
 the unary ~ even if it is weird) behave identically. This would not affect
 sums of boolean arrays. For the moment I saw one annoying change in
 numpy, and that is `abs(x - y)` being used for allclose and working
 nicely currently.

 I like mask = mask1 * mask2

 That's what I learned working my way through scipy.stats.distributions
 a long time ago.

* is least problematic case, since there numpy and python bools
already almost agree. (They return the same values, but numpy returns
a bool array instead of an integer array.)
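
A small sketch of that comparison, assuming two made-up masks. It only illustrates that `*` on boolean arrays gives the same truth values as `&` while keeping the bool dtype, whereas Python's own bools multiply to an int.

```python
import numpy as np

# Hypothetical masks, chosen to cover all four input combinations.
mask1 = np.array([True, True, False, False])
mask2 = np.array([True, False, True, False])

# NumPy: bool * bool stays bool and matches logical and.
product = mask1 * mask2
same_as_and = np.array_equal(product, mask1 & mask2)

# Plain Python: bool * bool is an int.
python_product = True * True

print(product.dtype, same_as_and)                      # bool True
print(type(python_product).__name__, python_product)   # int 1
```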

On Thu, Dec 5, 2013 at 8:05 PM, Alan G Isaac alan.is...@gmail.com wrote:
 For + and * (and thus `dot`), this will fix something that is not broken.
 It is in fact in conformance with a large literature on boolean arrays
 and boolean matrices.

Interesting point! I had assumed that dot() just upcast! But what do
you think about the inconsistency between sum() and dot() on bool
arrays?

-n
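
For readers following along, a minimal sketch of the inconsistency in question, assuming a made-up 2x2 boolean matrix `adj`. The behaviour described is that of a recent NumPy and is offered as an illustration, not as a statement about every release.

```python
import numpy as np

# Hypothetical boolean matrix (e.g. an adjacency relation).
adj = np.array([[True, True],
                [False, True]])

# sum() upcasts: the result counts True values per column.
col_counts = adj.sum(axis=0)

# dot() on two bool arrays stays bool: an "or of ands" (semiring product).
bool_product = np.dot(adj, adj)

# The same product over the integers counts the matching terms instead.
int_product = np.dot(adj.astype(int), adj.astype(int))

print(col_counts.tolist())    # [1, 2]
print(bool_product.tolist())  # [[True, True], [False, True]]
print(int_product.tolist())   # [[1, 2], [0, 1]]
```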


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Sebastian Berg
On Thu, 2013-12-05 at 23:02 -0500, josef.p...@gmail.com wrote:
 On Thu, Dec 5, 2013 at 10:56 PM, Alexander Belopolsky ndar...@mac.com wrote:
  On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net
  wrote:
  there was a discussion that for numpy booleans math operators +,-,* (and
  the unary -), while defined, are not very helpful.
 
  It has been suggested at the Github that there is an area where it is useful
  to have linear algebra operations like matrix multiplication to be defined
  over a semiring:
 
  http://en.wikipedia.org/wiki/Logical_matrix
 
  This still does not justify having unary or binary -, so I suggest that we
  first discuss deprecation of those.
 
 Does it make sense to only remove - and maybe / ?
 
 would python sum still work?   (I almost never use it.)
 
  sum(mask)
 2
  sum(mask.tolist())
 2
 
 is accumulate the same as sum and would keep working?
 
  np.add.accumulate(mask)
 array([0, 0, 0, 1, 2])
 
 
 In operation with other dtypes, do they still dominate so these work?
 

Hey,

of course the other types will always dominate interpreting bools as 0
and 1. This would only affect operations with only booleans. There is a
good point that * is well defined however you define it, though. (Btw. /
is not defined for bools, `np.bool_(True)/np.bool_(True)` will upcast to
int8 to do the operation)

However, while well defined, + is not defined like it is for python
bools (which are just ints) so that is the reason to consider
deprecation there (if we allow upcast to int8 -- or maybe the default
int -- in the future, in-place += and -= operations would not behave
differently, since they just cast back...).

I suppose python sum works because it first tries using the C-Api number
protocol, which also means it is not affected. If you were to write a
sum which just uses the `+` operator, it would be affected, but that
would seem good to me.

- Sebastian
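
A short sketch of the distinction drawn above, assuming the same five-element mask shape as in the quoted session (values made up). It shows the results on a recent NumPy: both sums count, while `+` between two bool arrays stays boolean.

```python
import numpy as np

# Hypothetical mask with two True entries.
mask = np.array([False, False, False, True, True])

# Python's builtin sum starts from the int 0, so it counts True values.
builtin_total = sum(mask)

# ndarray.sum also upcasts and counts.
method_total = mask.sum()

# bool + bool stays bool and behaves like logical or.
elementwise = mask + mask

# In-place addition also keeps the bool dtype.
acc = mask.copy()
acc += mask

print(int(builtin_total), int(method_total))  # 2 2
print(elementwise.dtype, acc.dtype)           # bool bool
```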


  x / mask
 array([0, 0, 0, 3, 4])
  x * 1. / mask
 array([ nan,  inf,  inf,   3.,   4.])
  x**mask
 array([1, 1, 1, 3, 4])
  mask - 5
 array([-5, -5, -5, -4, -4])
 
 Josef
 
 


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread josef . pktd
On Fri, Dec 6, 2013 at 4:39 AM, Sebastian Berg
sebast...@sipsolutions.net wrote:
 On Thu, 2013-12-05 at 23:02 -0500, josef.p...@gmail.com wrote:
 On Thu, Dec 5, 2013 at 10:56 PM, Alexander Belopolsky ndar...@mac.com 
 wrote:
  On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net
  wrote:
  there was a discussion that for numpy booleans math operators +,-,* (and
  the unary -), while defined, are not very helpful.
 
  It has been suggested at the Github that there is an area where it is 
  useful
  to have linear algebra operations like matrix multiplication to be defined
  over a semiring:
 
  http://en.wikipedia.org/wiki/Logical_matrix
 
  This still does not justify having unary or binary -, so I suggest that we
  first discuss deprecation of those.

 Does it make sense to only remove - and maybe / ?

 would python sum still work?   (I almost never use it.)

  sum(mask)
 2
  sum(mask.tolist())
 2

 is accumulate the same as sum and would keep working?

  np.add.accumulate(mask)
 array([0, 0, 0, 1, 2])


 In operation with other dtypes, do they still dominate so these work?


 Hey,


In statistics and econometrics (and economic theory) we just use an
indicator function 1_{x=5} which has largely the same properties as a
numpy bool array, at least in my code.

some of the common operations are *, dot and kron.

So far this has worked quite well as intuition, plus numpy casting rules.

dot is the main surprise, because I thought that it would upcast. (I
always think of dot as a np.linalg.)



 of course the other types will always dominate interpreting bools as 0
 and 1. This would only affect operations with only booleans.

My guess is that this would leave then 90% of our (statsmodels)
possible usage alone.

There is still the case that with * we can calculate the intersection.


There is a
 good point that * is well defined however you define it, though. (Btw. /
 is not defined for bools, `np.bool_(True)/np.bool_(True)` will upcast to
 int8 to do the operation)

 However, while well defined, + is not defined like it is for python
 bools (which are just ints) so that is the reason to consider
 deprecation there (if we allow upcast to int8 -- or maybe the default
 int -- in the future, in-place += and -= operations would not behave
 differently, since they just cast back...).

Actually, I used + once:

The calculation in terms of indicator functions is

1_{A} + 1_{B} - 1_{A & B}

The last part avoids double counting, which is not necessary if numpy
casts back to bool.
Nothing that couldn't be replaced by logical operators, but the
(linear) algebra is not logical.
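
A minimal sketch of the two ways to form the union indicator, assuming two made-up membership masks `a` and `b`. With integer indicators the intersection term is needed to avoid double counting; with boolean addition it is not.

```python
import numpy as np

# Hypothetical indicator arrays for sets A and B.
a = np.array([True, True, False, False])
b = np.array([True, False, True, False])

# Boolean addition saturates at True, so it is already the union.
union_via_bool_add = a + b

# Integer indicators need inclusion-exclusion: 1_A + 1_B - 1_{A and B}.
union_via_counts = a.astype(int) + b.astype(int) - (a & b).astype(int)

print(union_via_bool_add.tolist())  # [True, True, True, False]
print(union_via_counts.tolist())    # [1, 1, 1, 0]
```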

In this case I did care about memory because the arrays are (nobs,
nobs) (nobs is the number of observations shape[0]) which can be
large, and I have a sparse version also. In most other cases we use
astype(int) already very often, because eventually we still have to
cast and memory won't be a big problem.

The mental model is set membership and set operations with indicator
functions, not logical, and I don't remember running into problems
with this so far, and happily ignored logical_xxx when I do linear
algebra instead of just working with masks of booleans.

Nevertheless: If I'm forced to, then I will get used to logical_xxx. (*)
And the above bool addition hasn't made it into statsmodels yet. I
used a simpler version because I thought initially it's too cute. (And
I was using an older numpy that couldn't do broadcasted dot.)

(*) how do you search the documentation for `&` or `|`? I cannot
find what the other symbols are, if there are any.


 I suppose python sum works because it first tries using the C-Api number
 protocol, which also means it is not affected. If you were to write a
 sum which just uses the `+` operator, it would be affected, but that
 would seem good to me.

based on the ticket example, I'm not sure whether `+` should upcast or not.

 mm.dtype
dtype('bool')
 mm.sum(0)
array([48, 45, 56, 47])

 mm.sum(0, bool)
array([ True,  True,  True,  True], dtype=bool)
I would just use any

but what happens with logical cumsum

 mm[:5].cumsum(0, bool)
array([[False,  True,  True,  True],
   [ True,  True,  True,  True],
   [ True,  True,  True,  True],
   [ True,  True,  True,  True],
   [ True,  True,  True,  True]], dtype=bool)

same as mm[:5].astype(int).cumsum(0, bool)  without casting

Josef
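
A small sketch of the reductions discussed above, assuming a made-up 3x2 boolean array in place of `mm`. It suggests that a sum with `dtype=bool` matches `any`, and that a boolean cumsum matches an accumulated logical or.

```python
import numpy as np

# Hypothetical stand-in for mm; the real array is not shown in the thread.
mm = np.array([[False, True],
               [True, False],
               [False, False]])

counts = mm.sum(0)                  # default: counting sum
bool_sum = mm.sum(0, dtype=bool)    # boolean sum
any_cols = mm.any(0)                # logical equivalent

bool_cumsum = mm.cumsum(0, dtype=bool)
or_accumulate = np.logical_or.accumulate(mm, axis=0)

print(counts.tolist())       # [1, 1]
print(bool_sum.tolist())     # [True, True]
print(bool_cumsum.tolist())  # [[False, True], [True, True], [True, True]]
```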



 - Sebastian


  x / mask
 array([0, 0, 0, 3, 4])
  x * 1. / mask
 array([ nan,  inf,  inf,   3.,   4.])
  x**mask
 array([1, 1, 1, 3, 4])
  mask - 5
 array([-5, -5, -5, -4, -4])

 Josef

 

Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Alan G Isaac
 On Thu, Dec 5, 2013 at 8:05 PM, Alan G Isaac
 alan.is...@gmail.com wrote:
 For + and * (and thus `dot`), this will fix something that is not broken.
 It is in fact in conformance with a large literature on boolean arrays
 and boolean matrices.

On 12/6/2013 3:24 AM, Nathaniel Smith wrote:
 Interesting point! I had assumed that dot() just upcast! But what do
 you think about the inconsistency between sum() and dot() on bool
 arrays?


I don't like the behavior of sum on bool arrays.
(I.e., automatic upcasting.)
But I do not suggest changing it,
as much code is likely to depend on it.

Cheers,
Alan



Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread josef . pktd
On Fri, Dec 6, 2013 at 9:32 AM,  josef.p...@gmail.com wrote:
 On Fri, Dec 6, 2013 at 4:39 AM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
 On Thu, 2013-12-05 at 23:02 -0500, josef.p...@gmail.com wrote:
 On Thu, Dec 5, 2013 at 10:56 PM, Alexander Belopolsky ndar...@mac.com 
 wrote:
  On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg 
  sebast...@sipsolutions.net
  wrote:
  there was a discussion that for numpy booleans math operators +,-,* (and
  the unary -), while defined, are not very helpful.
 
  It has been suggested at the Github that there is an area where it is 
  useful
  to have linear algebra operations like matrix multiplication to be defined
  over a semiring:
 
  http://en.wikipedia.org/wiki/Logical_matrix
 
  This still does not justify having unary or binary -, so I suggest that we
  first discuss deprecation of those.

 Does it make sense to only remove - and maybe / ?

 would python sum still work?   (I almost never use it.)

  sum(mask)
 2
  sum(mask.tolist())
 2

 is accumulate the same as sum and would keep working?

  np.add.accumulate(mask)
 array([0, 0, 0, 1, 2])


 In operation with other dtypes, do they still dominate so these work?


 Hey,


 In statistics and econometrics (and economic theory) we just use an
 indicator function 1_{x=5} which has largely the same properties as a
 numpy bool array, at least in my code.

 some of the common operations are *, dot and kron.

 So far this has worked quite well as intuition, plus numpy casting rules.

 dot is the main surprise, because I thought that it would upcast. (I
 always think of dot as a np.linalg.)



 of course the other types will always dominate interpreting bools as 0
 and 1. This would only affect operations with only booleans.

 My guess is that this would leave then 90% of our (statsmodels)
 possible usage alone.

 There is still the case that with * we can calculate the intersection.


 There is a
 good point that * is well defined however you define it, though. (Btw. /
 is not defined for bools, `np.bool_(True)/np.bool_(True)` will upcast to
 int8 to do the operation)

 However, while well defined, + is not defined like it is for python
 bools (which are just ints) so that is the reason to consider
 deprecation there (if we allow upcast to int8 -- or maybe the default
 int -- in the future, in-place += and -= operations would not behave
 differently, since they just cast back...).

 Actually, I used + once:

 The calculation in terms of indicator functions is

 1_{A} + 1_{B} - 1_{A & B}

 The last part avoids double counting, which is not necessary if numpy
 casts back to bool.
 Nothing that couldn't be replaced by logical operators, but the
 (linear) algebra is not logical.

 In this case I did care about memory because the arrays are (nobs,
 nobs) (nobs is the number of observations shape[0]) which can be
 large, and I have a sparse version also. In most other cases we use
 astype(int) already very often, because eventually we still have to
 cast and memory won't be a big problem.

 The mental model is set membership and set operations with indicator
 functions, not logical, and I don't remember running into problems
 with this so far, and happily ignored logical_xxx when I do linear
 algebra instead of just working with masks of booleans.

http://en.wikipedia.org/wiki/Indicator_function
with the added advantage that we have also the version where +
constrains to (0, 1).
However `-` doesn't work properly because
 np.bool_(-5)
True
instead of False
except in the case `1 - mask`.

We really have two kinds of addition:

bool sum: for indicating set membership
counting sum: for counting number of elements.

from my viewpoint:

I would keep + and * since they work well (bool + and count +)
minus - is partially broken and `/` looks useless

this casts anyway
 1 - m1
array([1, 1, 0, 0, 0])

and I never thought of doing this
 True - m1
array([ True,  True, False, False, False], dtype=bool)

(python set defines minus but raises error on plus)

Josef
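
A minimal sketch of the two kinds of addition and of the `1 - mask` case, assuming a made-up mask with the same shape as `m1` above. Results are those of a recent NumPy; `~` is used for the boolean complement.

```python
import numpy as np

# Hypothetical mask.
m1 = np.array([False, False, True, True, True])

# Bool sum: stays within {False, True}, i.e. set membership.
membership = m1 + m1

# Counting sum: number of True elements.
count = m1.sum()

# Mixing with an int casts, so 1 - mask gives an integer complement.
complement_as_int = 1 - m1

# The boolean complement is written with ~.
complement_as_bool = ~m1

# Converting a negative number to bool gives True, not False.
negative_as_bool = bool(np.bool_(-5))

print(membership.tolist())         # [False, False, True, True, True]
print(int(count))                  # 3
print(complement_as_int.tolist())  # [1, 1, 0, 0, 0]
print(negative_as_bool)            # True
```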


 Nevertheless: If I'm forced to, then I will get used to logical_xxx. (*)
 And the above bool addition hasn't made it into statsmodels yet. I
 used a simpler version because I thought initially it's too cute. (And
 I was using an older numpy that couldn't do broadcasted dot.)

 (*) how do you search the documentation for `&` or `|`? I cannot
 find what the other symbols are, if there are any.


 I suppose python sum works because it first tries using the C-Api number
 protocol, which also means it is not affected. If you were to write a
 sum which just uses the `+` operator, it would be affected, but that
 would seem good to me.

 based on the ticket example, I'm not sure whether `+` should upcast or not.

 mm.dtype
 dtype('bool')
 mm.sum(0)
 array([48, 45, 56, 47])

 mm.sum(0, bool)
 array([ True,  True,  True,  True], dtype=bool)
 I would just use any

 but what happens with logical cumsum

 mm[:5].cumsum(0, bool)
 array([[False,  True,  True,  True],
[ True,  

Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Alan G Isaac
On 12/5/2013 11:14 PM, Alexander Belopolsky wrote:
 did you find minus to be as useful?


It is also a correct usage.

I think a good approach to this is to first realize that
there were good reasons for the current behavior.

Alan Isaac



Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread josef . pktd
On Fri, Dec 6, 2013 at 11:13 AM, Alan G Isaac alan.is...@gmail.com wrote:
 On 12/5/2013 11:14 PM, Alexander Belopolsky wrote:
 did you find minus to be as useful?


 It is also a correct usage.

 I think a good approach to this is to first realize that
 there were good reasons for the current behavior.

What's the meaning of minus?
I cannot make much sense out of it, or come up with any use case.

Josef


 Alan Isaac



Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Alexander Belopolsky
On Fri, Dec 6, 2013 at 11:13 AM, Alan G Isaac alan.is...@gmail.com wrote:

 On 12/5/2013 11:14 PM, Alexander Belopolsky wrote:
  did you find minus to be as useful?


 It is also a correct usage.


Can you provide a reference?



 I think a good approach to this is to first realize that
 there were good reasons for the current behavior.


Maybe there were, in which case the current behavior should be documented
somewhere.

What is the rationale for this:

 -array(True) + array(True)
True

?

I am not aware of any algebraic system where unary minus denotes anything
other than additive inverse.

Having bools form a semiring under + and * is a fine (yet somewhat unusual)
choice, but once you've made that choice you lose subtraction because True
+ x = True no longer has a unique solution.
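
The non-uniqueness can be checked directly. This is a minimal sketch that enumerates both boolean values of x and keeps those satisfying each equation, using NumPy's boolean `+` (logical or).

```python
import numpy as np

# Both possible boolean values for the unknown x.
x = np.array([False, True])

# True + x, evaluated elementwise with boolean addition.
true_plus_x = np.array([True, True]) + x

# Every x solves True + x = True, so there is no unique solution.
solutions_true = x[true_plus_x]

# By contrast, False + x = True has exactly one solution.
false_plus_x = np.array([False, False]) + x
solutions_false = x[false_plus_x]

print(solutions_true.tolist())   # [False, True]
print(solutions_false.tolist())  # [True]
```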


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread josef . pktd
On Fri, Dec 6, 2013 at 12:23 PM, Alexander Belopolsky ndar...@mac.com wrote:
 On Fri, Dec 6, 2013 at 11:13 AM, Alan G Isaac alan.is...@gmail.com wrote:

 On 12/5/2013 11:14 PM, Alexander Belopolsky wrote:
  did you find minus to be as useful?


 It is also a correct usage.


 Can you provide a reference?



 I think a good approach to this is to first realize that
 there were good reasons for the current behavior.


 Maybe there were, in which case the current behavior should be documented
 somewhere.

 What is the rationale for this:

 -array(True) + array(True)
 True

 ?

 I am not aware of any algebraic system where unary minus denotes anything
 other than additive inverse.

I would be perfectly happy if numpy would cast (negative) overflow to
the smallest value, instead of wrapping around.
The same is true for integers.

 np.array(0, np.int8) - np.array(-128, np.int8)
-128
 - np.array(-128, np.int8)
-128

Josef
It's consistent. But does it make sense?
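
A small sketch of the wraparound above next to a clipped alternative. The saturating version is only an illustration of the behaviour wished for here, built by widening to int16 and clipping; it is not something NumPy does by default. One-element arrays are used instead of 0-d arrays.

```python
import numpy as np

lowest = np.array([-128], dtype=np.int8)
zero = np.array([0], dtype=np.int8)

# int8 arithmetic wraps around: 0 - (-128) does not fit in int8.
wrapped = zero - lowest
negated = -lowest

# Hypothetical saturating subtraction: widen, subtract, clip, narrow.
wide = zero.astype(np.int16) - lowest.astype(np.int16)
saturated = np.clip(wide, -128, 127).astype(np.int8)

print(wrapped.tolist(), negated.tolist())  # [-128] [-128]
print(saturated.tolist())                  # [127]
```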

 Having bools form a semiring under + and * is a fine (yet somewhat unusual)
 choice, but once you've made that choice you lose subtraction because True
 + x = True no longer has a unique solution.



Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Alan G Isaac
On 12/6/2013 12:23 PM, Alexander Belopolsky wrote:
 What is the rationale for this:

   -array(True) + array(True)
 True


The minus is complementation.
So you are just writing
False or True

Alan Isaac



Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Alan G Isaac
 On 12/5/2013 11:14 PM, Alexander Belopolsky wrote:
 did you find minus to be as useful?

 On Fri, Dec 6, 2013 at 11:13 AM, Alan G Isaac
 It is also a correct usage.

On 12/6/2013 12:23 PM, Alexander Belopolsky wrote:
 Can you provide a reference?


For use of the minus sign, I don't have one at hand, but
a quick Google search comes up with:
http://www.csee.umbc.edu/~artola/fall02/BooleanAlgebra.ppt
It is more common to use a superscript `c`, but that's
just a notational issue.

For multiplication, addition, and dot,
you can see Ki Hang Kim's Boolean matrix
Theory and Applications.

Applications are endless and include graph theory
and (then naturally) circuit design.

Cheers,
Alan Isaac



Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread josef . pktd
On Fri, Dec 6, 2013 at 1:16 PM, Alan G Isaac alan.is...@gmail.com wrote:
 On 12/6/2013 12:23 PM, Alexander Belopolsky wrote:
 What is the rationale for this:

   -array(True) + array(True)
 True


 The minus is complementation.
 So you are just writing
 False or True

unary versus binary minus

 m1 + (-m2)
array([False, False,  True,  True,  True], dtype=bool)
 m1 - m2
array([ True,  True, False, False,  True], dtype=bool)

 -m2 + m1
array([False, False,  True,  True,  True], dtype=bool)

 m1 - (-m2)
array([False, False,  True,  True, False], dtype=bool)

I'd rather write ~ than unary - if that's what it is.

Josef



 Alan Isaac



Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Alan G Isaac
On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
 unary versus binary minus

Oh right; I consider binary `-` broken for
Boolean arrays. (Sorry Alexander; I did not
see your entire issue.)


 I'd rather write ~ than unary - if that's what it is.

I agree.  So I have no objection to elimination
of the `-`.  I see it does the subtraction and then
a boolean conversion, which is not helpful.
Or rather, I do not see how it can be helpful.

Alan Isaac


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread josef . pktd
On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote:
 On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
 unary versus binary minus

 Oh right; I consider binary `-` broken for
 Boolean arrays. (Sorry Alexander; I did not
 see your entire issue.)


 I'd rather write ~ than unary - if that's what it is.

 I agree.  So I have no objection to elimination
 of the `-`.  I see it does the subtraction and then
 a boolean conversion, which is not helpful.
 Or rather, I do not see how it can be helpful.

What I would or might find useful is if binary `-` subtracts set
membership instead of doing xor

 m1 = np.array([0,0,1,1], bool)
 m2 = np.array([0,1,0,1], bool)
 m1 - m2
array([False,  True,  True, False], dtype=bool)
 np.logical_xor(m1, m2)
array([False,  True,  True, False], dtype=bool)

 np.clip(m1.astype(int) - m2.astype(int), 0, 1).astype(bool)
array([False, False,  True, False], dtype=bool)
 np.nonzero(_)[0]
array([2])

 s1 = set(np.arange(4)[m1])
 s2 = set(np.arange(4)[m2])
 s1 - s2
set([2])

Josef
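
The same set subtraction can be written with logical operators only. This is a minimal sketch using the `m1` and `m2` from the session above; `m1 & ~m2` is offered as one possible spelling, not as what binary `-` does.

```python
import numpy as np

m1 = np.array([0, 0, 1, 1], bool)
m2 = np.array([0, 1, 0, 1], bool)

# xor: True where exactly one of the masks is True.
xor = m1 ^ m2

# Set difference: in m1 but not in m2.
difference = m1 & ~m2

# The clipped integer subtraction from the session, for comparison.
clipped = np.clip(m1.astype(int) - m2.astype(int), 0, 1).astype(bool)

# Cross-check against Python sets of the True positions.
s1 = set(np.flatnonzero(m1).tolist())
s2 = set(np.flatnonzero(m2).tolist())

print(xor.tolist())         # [False, True, True, False]
print(difference.tolist())  # [False, False, True, False]
print(s1 - s2)              # {2}
```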


 Alan Isaac


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Alexander Belopolsky
On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote:

 On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
  unary versus binary minus

 Oh right; I consider binary `-` broken for
 Boolean arrays. (Sorry Alexander; I did not
 see your entire issue.)


  I'd rather write ~ than unary - if that's what it is.

 I agree.  So I have no objection to elimination
 of the `-`.


It looks like we are close to reaching a consensus on the following points:

1. * is well-defined on boolean arrays and may be used in preference to &
in code that is designed to handle 1s and 0s of any dtype in addition to
booleans.

2. + is defined consistently with * and the only issue is the absence of
additive inverse.  This is not a problem as long as presence of - does not
suggest otherwise.

3. binary and unary minus should be deprecated because its use in
expressions where variables can be either boolean or numeric would lead to
subtle bugs.  For example -x*y would produce different results from -(x*y)
depending on whether x is boolean or not.  In all situations, ^ is
preferable to binary - and ~ is preferable to unary -.

4. changing boolean arithmetics to auto-promotion to int is precluded by a
significant use-case of boolean matrices.
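
A minimal sketch of the bug described in point 3, assuming made-up arrays. Since the thread treats unary minus on booleans as complementation, it is written here as `~` and `*` as `&`, so the sketch does not depend on how a given NumPy release handles `-` on bool arrays.

```python
import numpy as np

# Hypothetical boolean inputs.
x = np.array([True, False, True])
y = np.array([True, True, False])

# "-x*y" with complement semantics: negate x first, then multiply.
negate_then_multiply = (~x) & y

# "-(x*y)" with complement semantics: multiply first, then negate.
multiply_then_negate = ~(x & y)

# The same expressions on 0/1 integers agree with each other.
xi, yi = x.astype(int), y.astype(int)
int_first = -xi * yi
int_second = -(xi * yi)

print(negate_then_multiply.tolist())         # [False, True, False]
print(multiply_then_negate.tolist())         # [False, True, True]
print(np.array_equal(int_first, int_second)) # True
```

The two boolean results differ in the last position, while the integer versions are identical, which is the kind of dtype-dependent discrepancy the point warns about.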


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Nathaniel Smith
On Fri, Dec 6, 2013 at 11:55 AM, Alexander Belopolsky ndar...@mac.com wrote:



 On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote:

 On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
  unary versus binary minus

 Oh right; I consider binary `-` broken for
 Boolean arrays. (Sorry Alexander; I did not
 see your entire issue.)


  I'd rather write ~ than unary - if that's what it is.

 I agree.  So I have no objection to elimination
 of the `-`.


 It looks like we are close to reaching a consensus on the following points:

 1. * is well-defined on boolean arrays and may be used in preference to & in
 code that is designed to handle 1s and 0s of any dtype in addition to
 booleans.

 2. + is defined consistently with * and the only issue is the absence of
 additive inverse.  This is not a problem as long as presence of - does not
 suggest otherwise.

 3. binary and unary minus should be deprecated because its use in
 expressions where variables can be either boolean or numeric would lead to
 subtle bugs.  For example -x*y would produce different results from -(x*y)
 depending on whether x is boolean or not.  In all situations, ^ is
 preferable to binary - and ~ is preferable to unary -.

 4. changing boolean arithmetics to auto-promotion to int is precluded by a
 significant use-case of boolean matrices.

+1

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread josef . pktd
On Fri, Dec 6, 2013 at 2:59 PM, Nathaniel Smith n...@pobox.com wrote:
 On Fri, Dec 6, 2013 at 11:55 AM, Alexander Belopolsky ndar...@mac.com wrote:



 On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote:

 On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
  unary versus binary minus

 Oh right; I consider binary `-` broken for
 Boolean arrays. (Sorry Alexander; I did not
 see your entire issue.)


  I'd rather write ~ than unary - if that's what it is.

 I agree.  So I have no objection to elimination
 of the `-`.


 It looks like we are close to reaching a consensus on the following points:

 1. * is well-defined on boolean arrays and may be used in preference to & in
 code that is designed to handle 1s and 0s of any dtype in addition to
 booleans.

 2. + is defined consistently with * and the only issue is the absence of
 additive inverse.  This is not a problem as long as presence of - does not
 suggest otherwise.

 3. binary and unary minus should be deprecated because its use in
 expressions where variables can be either boolean or numeric would lead to
 subtle bugs.  For example -x*y would produce different results from -(x*y)
 depending on whether x is boolean or not.  In all situations, ^ is
 preferable to binary - and ~ is preferable to unary -.

 4. changing boolean arithmetics to auto-promotion to int is precluded by a
 significant use-case of boolean matrices.

 +1

+0.5
(I would still prefer a different binary minus, but it would be
inconsistent with a logical unary minus that negates.)

5. `/` is useless
6. `**` follows from 1.

Josef





Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Sebastian Berg
On Fri, 2013-12-06 at 15:30 -0500, josef.p...@gmail.com wrote:
 On Fri, Dec 6, 2013 at 2:59 PM, Nathaniel Smith n...@pobox.com wrote:
  On Fri, Dec 6, 2013 at 11:55 AM, Alexander Belopolsky ndar...@mac.com 
  wrote:
 
 
 
  On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote:
 
  On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
   unary versus binary minus
 
  Oh right; I consider binary `-` broken for
  Boolean arrays. (Sorry Alexander; I did not
  see your entire issue.)
 
 
   I'd rather write ~ than unary - if that's what it is.
 
  I agree.  So I have no objection to elimination
  of the `-`.
 
 
  It looks like we are close to reaching a consensus on the following points:
 
  1. * is well-defined on boolean arrays and may be used in preference of &
  in
  code that is designed to handle 1s and 0s of any dtype in addition to
  booleans.
 
  2. + is defined consistently with * and the only issue is the absence of
  additive inverse.  This is not a problem as long as presence of - does not
  suggest otherwise.
 
  3. binary and unary minus should be deprecated because its use in
  expressions where variables can be either boolean or numeric would lead to
  subtle bugs.  For example -x*y would produce different results from -(x*y)
  depending on whether x is boolean or not.  In all situations, ^ is
  preferable to binary - and ~ is preferable to unary -.
 
  4. changing boolean arithmetics to auto-promotion to int is precluded by a
  significant use-case of boolean matrices.
 
  +1
 
 +0.5
 (I would still prefer a different binary minus, but it would be
 inconsistent with a logical unary minus that negates.)
 

The question is whether the current xor behaviour of binary minus can make
sense. It doesn't seem to make much sense mathematically, which only leaves
that `abs(x - y)` is actually what a (Python) programmer might expect.
I think I would like to deprecate at least the unary one. The ~ kind of
behaviour just doesn't fit as far as I can see.

 5. `/` is useless
 6 `**` follows from 1.

Both of these are currently not defined; they will just cause an upcast to
int8. I suppose it would be possible to deprecate that upcast though
(the same goes for almost all other ufuncs/operators in principle).
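To see what actually happens on a given installation, one can simply inspect the result dtypes. This is only a sketch: the arrays are made up, and the exact dtypes may differ between NumPy versions, which is why nothing beyond "not boolean" is claimed in the comments.

```python
import numpy as np

m1 = np.array([False, False, True, True])
m2 = np.array([True, True, True, True])   # all True, so no division by zero

power = m1 ** m2       # no boolean loop for power, so the operands are upcast
quotient = m1 / m2     # true division returns floating point

print(np.__version__, power.dtype, quotient.dtype)
```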

 


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread josef . pktd
On Fri, Dec 6, 2013 at 3:50 PM, Sebastian Berg
sebast...@sipsolutions.net wrote:
 On Fri, 2013-12-06 at 15:30 -0500, josef.p...@gmail.com wrote:
 On Fri, Dec 6, 2013 at 2:59 PM, Nathaniel Smith n...@pobox.com wrote:
  On Fri, Dec 6, 2013 at 11:55 AM, Alexander Belopolsky ndar...@mac.com 
  wrote:
 
 
 
  On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote:
 
  On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
   unary versus binary minus
 
  Oh right; I consider binary `-` broken for
  Boolean arrays. (Sorry Alexander; I did not
  see your entire issue.)
 
 
   I'd rather write ~ than unary - if that's what it is.
 
  I agree.  So I have no objection to elimination
  of the `-`.
 
 
  It looks like we are close to reaching a consensus on the following 
  points:
 
  1. * is well-defined on boolean arrays and may be used in preference of &
  in
  code that is designed to handle 1s and 0s of any dtype in addition to
  booleans.
 
  2. + is defined consistently with * and the only issue is the absence of
  additive inverse.  This is not a problem as long as presence of - does not
  suggest otherwise.
 
  3. binary and unary minus should be deprecated because its use in
  expressions where variables can be either boolean or numeric would lead to
  subtle bugs.  For example -x*y would produce different results from -(x*y)
  depending on whether x is boolean or not.  In all situations, ^ is
  preferable to binary - and ~ is preferable to unary -.
 
  4. changing boolean arithmetics to auto-promotion to int is precluded by a
  significant use-case of boolean matrices.
 
  +1

 +0.5
 (I would still prefer a different binary minus, but it would be
 inconsistent with a logical unary minus that negates.)


 The question is if the current xor behaviour can make sense? It doesn't
 seem to make much sense mathematically? Which only leaves that `abs(x -
 y)` is actually what a (python) programmer might expect.
 I think I would like to deprecate at least the unary one. The ~ kind of
 behaviour just doesn't fit as far as I can see.

I haven't seen any real use cases for xor yet.
My impression is that both plus and minus are just overflow accidents
and not intentional. plus works in a useful way, minus as xor might be
used once per century.

I would deprecate both unary and binary minus.

(And when nobody is looking in two versions from now, I would add a
binary minus that overflows to the clipped version, so I get a set
subtraction. :)


 5. `/` is useless
 6 `**` follows from 1.

>>> m1 ** m2
array([1, 0, 1, 1], dtype=int8)
>>> m1 ** 2
array([False, False,  True,  True], dtype=bool)
>>> m1 ** 3
array([0, 0, 1, 1])

but I'm using Python with an old numpy right now
>>> np.__version__
'1.6.1'


 Both of these are currently not defined, they will just cause upcast to
 int8. I suppose it would be possible to deprecate that upcast though
 (same goes for most all other ufuncs/operators in principle).

We would have to start the discussion again for all other
operators/ufuncs to see if they are useful in some cases.
For most treating as int will make sense, I guess.

Josef





Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Alan G Isaac
On 12/6/2013 3:30 PM, josef.p...@gmail.com wrote:
 6 `**` follows from 1.


Yes, but what really matters is that
linalg.matrix_power
give the correct (boolean) result.

Alan
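For reference, a boolean matrix power can be sketched with repeated `np.dot`, which keeps the boolean dtype. `bool_matrix_power` is a hypothetical helper written for this illustration, not NumPy's `linalg.matrix_power`, and the adjacency matrix is made up.

```python
import numpy as np

def bool_matrix_power(a, n):
    """Boolean matrix power by repeated np.dot (hypothetical helper, n >= 0)."""
    a = np.asarray(a, dtype=bool)
    result = np.eye(a.shape[0], dtype=bool)   # boolean identity matrix
    for _ in range(n):
        result = np.dot(result, a)   # stays boolean: "and" for *, "or" for +
    return result

# Adjacency matrix of the path 0 -> 1 -> 2
adj = np.array([[0, 1, 0],
                [0, 0, 1],
                [0, 0, 0]], dtype=bool)
two_steps = bool_matrix_power(adj, 2)   # True only where a 2-step path exists
```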



Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Alan G Isaac
On 12/6/2013 3:50 PM, Sebastian Berg wrote:
 Both of these are currently not defined, they will just cause upcast to
 int8.


What does currently mean?
`**` works fine for boolean arrays in 1.7.1.
(It's useless, but it works.)

Alan Isaac



Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread josef . pktd
On Fri, Dec 6, 2013 at 4:14 PM,  josef.p...@gmail.com wrote:
 On Fri, Dec 6, 2013 at 3:50 PM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
 On Fri, 2013-12-06 at 15:30 -0500, josef.p...@gmail.com wrote:
 On Fri, Dec 6, 2013 at 2:59 PM, Nathaniel Smith n...@pobox.com wrote:
  On Fri, Dec 6, 2013 at 11:55 AM, Alexander Belopolsky ndar...@mac.com 
  wrote:
 
 
 
  On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com 
  wrote:
 
  On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
   unary versus binary minus
 
  Oh right; I consider binary `-` broken for
  Boolean arrays. (Sorry Alexander; I did not
  see your entire issue.)
 
 
   I'd rather write ~ than unary - if that's what it is.
 
  I agree.  So I have no objection to elimination
  of the `-`.
 
 
  It looks like we are close to reaching a consensus on the following 
  points:
 
  1. * is well-defined on boolean arrays and may be used in preference of
  & in
  code that is designed to handle 1s and 0s of any dtype in addition to
  booleans.
 
  2. + is defined consistently with * and the only issue is the absence of
  additive inverse.  This is not a problem as long as presence of - does 
  not
  suggest otherwise.
 
  3. binary and unary minus should be deprecated because its use in
  expressions where variables can be either boolean or numeric would lead 
  to
  subtle bugs.  For example -x*y would produce different results from 
  -(x*y)
  depending on whether x is boolean or not.  In all situations, ^ is
  preferable to binary - and ~ is preferable to unary -.
 
  4. changing boolean arithmetics to auto-promotion to int is precluded by 
  a
  significant use-case of boolean matrices.
 
  +1

 +0.5
 (I would still prefer a different binary minus, but it would be
 inconsistent with a logical unary minus that negates.)


 The question is if the current xor behaviour can make sense? It doesn't
 seem to make much sense mathematically? Which only leaves that `abs(x -
 y)` is actually what a (python) programmer might expect.
 I think I would like to deprecate at least the unary one. The ~ kind of
 behaviour just doesn't fit as far as I can see.

 I haven't seen any real use cases for xor yet.
 My impression is that both plus and minus are just overflow accidents
 and not intentional. plus works in a useful way, minus as xor might be
 used once per century.

 I would deprecate both unary and binary minus.

 (And when nobody is looking in two versions from now, I would add a
 binary minus that overflows to the clipped version, so I get a set
 subtraction. :)

Actually minus works as expected if we avoid negative overflow:

>>> m1 - m1*m2
array([False, False,  True, False], dtype=bool)
>>> m1 * ~m2
array([False, False,  True, False], dtype=bool)
>>> m1 & ~m2
array([False, False,  True, False], dtype=bool)

I find the first easy to read, but m1 - m2 would be one operation
less, and chain more easily m1 - m2 - m3
m1 are mailing list subscribers, take away
m2 owners of apples, take away
m3 users of Linux
= exotic developers

Josef
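The set-subtraction reading can be written without any boolean minus. A minimal sketch: `m1` and `m2` are the values that can be inferred from the outputs quoted in this thread, while `m3` and the helper `set_minus` are made up for the chained case.

```python
import numpy as np

m1 = np.array([False, False, True, True])
m2 = np.array([False, True, False, True])
m3 = np.array([False, False, True, False])   # hypothetical third mask

def set_minus(a, *others):
    """Elements in `a` that are in none of `others` (hypothetical helper)."""
    out = np.asarray(a, dtype=bool).copy()
    for other in others:
        out &= ~np.asarray(other, dtype=bool)
    return out

only_m1 = set_minus(m1, m2)       # same as m1 & ~m2
chained = set_minus(m1, m2, m3)   # the proposed m1 - m2 - m3
```

With these particular masks nobody is left after the second subtraction, so `chained` is all False.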









Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Nathaniel Smith
Not sure how much time it's worth spending on coming up with new
definitions for boolean subtraction, since even if we deprecate the
current behavior now we won't be able to implement any of them for a
year+, and then we'll end up having to go through these debates again
then anyway.

-n



-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
___

Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread josef . pktd
On Fri, Dec 6, 2013 at 5:45 PM, Nathaniel Smith n...@pobox.com wrote:
 Not sure how much time it's worth spending on coming up with new
 definitions for boolean subtraction, since even if we deprecate the
 current behavior now we won't be able to implement any of them for a
 year+, and then we'll end up having to go through these debates again
 then anyway.

I didn't argue against deprecation of the boolean minuses. I'm fine with that.

Just some early lobbying, and so I can save my examples where I can
google them in case I'm still around if or when we can revisit the
issue.
Once I turn off the Python interpreter that I used for the examples, I
will forget everything about weird boolean operations.

One advantage of this thread is that I had to look up the math for
indicator functions, and that I have a better idea where I could use
logical operators instead of (linear) algebra.

Josef



Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Charles R Harris
On Fri, Dec 6, 2013 at 2:14 PM, josef.p...@gmail.com wrote:

 On Fri, Dec 6, 2013 at 3:50 PM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
  On Fri, 2013-12-06 at 15:30 -0500, josef.p...@gmail.com wrote:
  On Fri, Dec 6, 2013 at 2:59 PM, Nathaniel Smith n...@pobox.com wrote:
   On Fri, Dec 6, 2013 at 11:55 AM, Alexander Belopolsky 
 ndar...@mac.com wrote:
  
  
  
   On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com
 wrote:
  
   On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
unary versus binary minus
  
   Oh right; I consider binary `-` broken for
   Boolean arrays. (Sorry Alexander; I did not
   see your entire issue.)
  
  
I'd rather write ~ than unary - if that's what it is.
  
   I agree.  So I have no objection to elimination
   of the `-`.
  
  
   It looks like we are close to reaching a consensus on the following
 points:
  
   1. * is well-defined on boolean arrays and may be used in preference
 of & in
   code that is designed to handle 1s and 0s of any dtype in addition to
   booleans.
  
   2. + is defined consistently with * and the only issue is the
 absence of
   additive inverse.  This is not a problem as long as presence of -
 does not
   suggest otherwise.
  
   3. binary and unary minus should be deprecated because its use in
   expressions where variables can be either boolean or numeric would
 lead to
   subtle bugs.  For example -x*y would produce different results from
 -(x*y)
   depending on whether x is boolean or not.  In all situations, ^ is
   preferable to binary - and ~ is preferable to unary -.
  
   4. changing boolean arithmetics to auto-promotion to int is
 precluded by a
   significant use-case of boolean matrices.
  
   +1
 
  +0.5
  (I would still prefer a different binary minus, but it would be
  inconsistent with a logical unary minus that negates.)
 
 
  The question is if the current xor behaviour can make sense? It doesn't
  seem to make much sense mathematically? Which only leaves that `abs(x -
  y)` is actually what a (python) programmer might expect.
  I think I would like to deprecate at least the unary one. The ~ kind of
  behaviour just doesn't fit as far as I can see.

 I haven't seen any real use cases for xor yet.


Using it instead of '+' yields a boolean ring instead of semi-ring. Papers
from the first quarter of the last century used it pretty often on that
account, hence 'sigma-rings', etc. Eventually the simplicity of the
inclusive or overcame that tendency.

My impression is that both plus and minus are just overflow accidents
 and not intentional. plus works in a useful way, minus as xor might be
 used once per century.


It's certainly weird given that '+' means the inclusive or. I think '^' is
much preferable.
Although it makes some sense if one can keep the semantics straight.
Complicated, though.


 I would deprecate both unary and binary minus.

 (And when nobody is looking in two versions from now, I would add a
 binary minus that overflows to the clipped version, so I get a set
 subtraction. :)


Where is '\' when you need it?

snip

Chuck
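The ring versus semi-ring distinction can be checked exhaustively on the two boolean values. This is only an illustrative sketch with made-up variable names: xor gives every element an additive inverse, inclusive or does not.

```python
import numpy as np

values = np.array([False, True])
x, y = np.meshgrid(values, values)   # all four (x, y) combinations

# With xor as addition, every element is its own additive inverse: x ^ x is
# False, so x + y = z can always be solved for y as y = z ^ x.
xor_self_inverse = not np.any(values ^ values)
xor_solvable = np.array_equal((x ^ y) ^ x, y)

# With inclusive or as addition, True has no inverse: True | y is never False.
true_has_or_inverse = bool(np.any(~np.logical_or(True, values)))
```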


[Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread Sebastian Berg
Hey,

there was a discussion that for numpy booleans the math operators +,-,* (and
the unary -), while defined, are not very helpful. I have set up a quick
PR as a start (it still needs some fixes inside numpy):

https://github.com/numpy/numpy/pull/4105

The idea is to deprecate these, since the binary operators |,^,& (and
the unary ~, even if it is weird) behave identically. This would not affect
sums of boolean arrays. For the moment I saw one annoying change in
numpy, and that is `abs(x - y)` being used for allclose and working
nicely currently.

- Sebastian
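For the `abs(x - y)` case, the same information is available without boolean subtraction. A minimal sketch with made-up arrays, comparing the integer formulation against xor:

```python
import numpy as np

x = np.array([False, False, True, True])
y = np.array([False, True, False, True])

# For 0/1 data, |x - y| is 1 exactly where x and y differ.
as_int = np.abs(x.astype(int) - y.astype(int))

# The same information without boolean subtraction:
differs = x ^ y   # equivalently np.logical_xor(x, y) or x != y

same = np.array_equal(as_int.astype(bool), differs)
```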



Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread Alexander Belopolsky
On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg
sebast...@sipsolutions.net wrote:

 For the moment I saw one annoying change in
 numpy, and that is `abs(x - y)` being used for allclose and working
 nicely currently.


It would probably be an improvement if allclose returned all(x == y) unless
one of the arguments is inexact.  At the moment allclose() fails for char
arrays:

>>> allclose('abc', 'abc')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "numpy/core/numeric.py", line 2114, in allclose
    xinf = isinf(x)
TypeError: Not implemented for this type
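The suggestion could look roughly like the following. `allclose_or_equal` is a hypothetical wrapper written for this sketch, not a NumPy function, and it ignores details such as shape mismatches.

```python
import numpy as np

def allclose_or_equal(x, y, **kwargs):
    """Exact comparison unless an argument is inexact (hypothetical helper)."""
    x, y = np.asarray(x), np.asarray(y)
    if (np.issubdtype(x.dtype, np.inexact)
            or np.issubdtype(y.dtype, np.inexact)):
        return bool(np.allclose(x, y, **kwargs))   # float/complex: tolerance
    return bool(np.all(x == y))                    # bool, int, str: exact

strings_ok = allclose_or_equal('abc', 'abc')
bools_ok = allclose_or_equal([True, False], [True, False])
floats_ok = allclose_or_equal([0.1 + 0.2], [0.3])
```

Boolean and string inputs never reach `isinf` or the subtraction on this path, so the deprecation would not affect them.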


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread josef . pktd
On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg
sebast...@sipsolutions.net wrote:
 Hey,

 there was a discussion that for numpy booleans math operators +,-,* (and
 the unary -), while defined, are not very helpful. I have set up a quick
 PR with start (needs some fixes inside numpy still):

 https://github.com/numpy/numpy/pull/4105

 The idea is to deprecate these, since the binary operators |,^,| (and
 the unary ~ even if it is weird) behave identical. This would not affect
 sums of boolean arrays. For the moment I saw one annoying change in
 numpy, and that is `abs(x - y)` being used for allclose and working
 nicely currently.

I like mask = mask1 * mask2

That's what I learned working my way through scipy.stats.distributions
a long time ago.

But the main thing is that we often use booleans as 0/1 integer arrays
in the actual calculations, and I only sometimes add the astype(int):

x[:, None] * (y[:, None] == np.unique(y))

I always thought booleans *are* just 0, 1 integers, until the last
discussion, where we saw the weird + or - behavior.

We also use rescaling to (-1, 1) in statsmodels:   y = mask * 2 - 1
(but maybe we convert to integer first)
My guess is that I only use multiplication heavily, where the boolean
is a dummy variable with 0 if male and 1 if female for example.

Nothing serious but nice not to have to worry about casting with
astype(int) first.

x[:, None] * (y[:, None] == np.unique(y)).astype(int) (Is the
bracket at the right spot ?)

Josef
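A runnable version of the dummy-variable pattern, with made-up data for `x` and `y`, showing that the product with a boolean indicator matrix gives the same result with or without the explicit `astype(int)`:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0, 1, 0, 1])             # group labels (made-up data)

dummies = y[:, None] == np.unique(y)   # boolean indicator matrix, shape (4, 2)
by_group = x[:, None] * dummies        # bool is promoted to float by the product
explicit = x[:, None] * dummies.astype(int)

mask = y == 1
rescaled = mask * 2 - 1                # bool * int gives int: values in {-1, 1}
```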





Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread josef . pktd
On Thu, Dec 5, 2013 at 10:33 PM,  josef.p...@gmail.com wrote:
 On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
 Hey,

 there was a discussion that for numpy booleans math operators +,-,* (and
 the unary -), while defined, are not very helpful. I have set up a quick
 PR with start (needs some fixes inside numpy still):

 https://github.com/numpy/numpy/pull/4105

 The idea is to deprecate these, since the binary operators |,^,| (and
 the unary ~ even if it is weird) behave identical. This would not affect
 sums of boolean arrays. For the moment I saw one annoying change in
 numpy, and that is `abs(x - y)` being used for allclose and working
 nicely currently.

 I like mask = mask1 * mask2

 That's what I learned working my way through scipy.stats.distributions
 a long time ago.

 But the main thing is that we use boolean often as 0,1 integer array
 in the actual calculations, and I only sometimes add the astype(int)

 x[:, None] * (y[:, None] == np.unique(y))

 I always thought booleans *are* just 0, 1 integers, until last time
 there was the discussion we saw the weird + or - behavior.

 We also use rescaling to (-1, 1) in statsmodels   y = mask * 2 - 1
 (but maybe we convert to integer first)
 My guess is that I only use multiplication heavily, where the boolean
 is a dummy variable with 0 if male and 1 if female for example.

 Nothing serious but nice not to have to worry about casting with
 astype(int) first.

 x[:, None] * (y[:, None] == np.unique(y)).astype(int) (Is the
 bracket at the right spot ?)


what about np.dot? np.dot(mask, x) is the same as (mask * x).sum(0)

Josef




Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread Alexander Belopolsky
On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net
wrote:
 there was a discussion that for numpy booleans math operators +,-,* (and
 the unary -), while defined, are not very helpful.

It has been suggested on GitHub that there is an area where it is
useful for linear algebra operations like matrix multiplication to be
defined over a semiring:

http://en.wikipedia.org/wiki/Logical_matrix

This still does not justify having unary or binary -, so I suggest that we
first discuss deprecation of those.


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread Alexander Belopolsky
On Thu, Dec 5, 2013 at 10:35 PM, josef.p...@gmail.com wrote:

 what about np.dot,np.dot(mask, x) which is the same as (mask *
 x).sum(0) ?


I am not sure which way your argument goes, but I don't think you would
find the following natural:

>>> x = array([True, True])
>>> dot(x,x)
True
>>> (x*x).sum()
2
>>> (x*x).sum(0)
2
>>> (x*x).sum(False)
2
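The same contrast as a self-contained script, for anyone who wants to check it on their own NumPy: `dot` stays in the boolean semiring, while `sum` counts.

```python
import numpy as np

x = np.array([True, True])

dot_result = np.dot(x, x)     # boolean inner product: "or" of the "and"s
sum_result = (x * x).sum()    # sum() counts the True entries as integers
```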


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread josef . pktd
On Thu, Dec 5, 2013 at 10:56 PM, Alexander Belopolsky ndar...@mac.com wrote:
 On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net
 wrote:
 there was a discussion that for numpy booleans math operators +,-,* (and
 the unary -), while defined, are not very helpful.

 It has been suggested at the Github that there is an area where it is useful
 to have linear algebra operations like matrix multiplication to be defined
 over a semiring:

 http://en.wikipedia.org/wiki/Logical_matrix

 This still does not justify having unary or binary -, so I suggest that we
 first discuss deprecation of those.

Does it make sense to remove only - and maybe / ?

Would Python's built-in sum still work? (I almost never use it.)

>>> sum(mask)
2
>>> sum(mask.tolist())
2

Is accumulate the same as sum, and would it keep working?

>>> np.add.accumulate(mask)
array([0, 0, 0, 1, 2])


In operations with other dtypes, do those dtypes still dominate so that these work?

>>> x / mask
array([0, 0, 0, 3, 4])
>>> x * 1. / mask
array([ nan,  inf,  inf,   3.,   4.])
>>> x**mask
array([1, 1, 1, 3, 4])
>>> mask - 5
array([-5, -5, -5, -4, -4])
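
A sketch of the checks above that do not involve division. The inputs are not shown in the message, so `mask` and `x` below are assumptions inferred from the printed outputs; the behavior described is that of a recent NumPy:

```python
import numpy as np

# Assumed inputs, inferred from the outputs shown above.
mask = np.array([False, False, False, True, True])
x = np.arange(5)

builtin_total = sum(mask)            # Python's sum adds the elements one by one
running = np.add.accumulate(mask)    # running count of True values
power = x ** mask                    # integer dtype dominates: x**0 or x**1
shifted = mask - 5                   # bool minus int promotes to int
```

In each case at least one operand or the reduction itself is integer-valued, so none of these depend on a bool-with-bool operator being defined.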

Josef




Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread Alan G Isaac
For + and * (and thus `dot`), this will fix something that is not broken.
It is in fact in conformance with a large literature on boolean arrays
and boolean matrices.  That not everyone pays attention to this literature
does not constitute a reason to break the extant, correct behavior.

I'm sure I cannot be the only one who has for years taught students
about Boolean matrices using NumPy, because of this correct behavior
of this dtype. (By correct, I mean in conformance with the literature.)

Alan Isaac


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread Alexander Belopolsky
On Thu, Dec 5, 2013 at 11:05 PM, Alan G Isaac alan.is...@gmail.com wrote:

 For + and * (and thus `dot`), this will fix something that is not broken.


+ and * are not broken - just redundant given | and &.

What is really broken is -, both unary and binary:

>>> int(np.bool_(0) - np.bool_(1))
1
>>> int(-np.bool_(0))
1
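
For comparison, a small sketch of the same two results written with ^ and ~ instead of minus. The arrays `a` and `b` are made up for the example:

```python
import numpy as np

a = np.array([False, False, True, True])
b = np.array([False, True, False, True])

# XOR is the boolean analogue of subtraction: True where a and b differ.
difference = a ^ b

# Logical NOT is the boolean analogue of negation.
negated = ~a

# The two scalar cases from the session above, written without minus.
scalar_xor = int(np.bool_(0) ^ np.bool_(1))
scalar_not = int(~np.bool_(0))
```

Both scalar expressions give 1, matching the outputs shown for binary and unary minus, while making the logical intent explicit.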

 I'm sure I cannot be the only one who has for years taught students
 about Boolean matrices using NumPy

(I would not be so sure:-)

In that experience, did you find minus to be as useful?


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread josef . pktd
On Thu, Dec 5, 2013 at 11:00 PM, Alexander Belopolsky ndar...@mac.com wrote:

 On Thu, Dec 5, 2013 at 10:35 PM, josef.p...@gmail.com wrote:

 what about np.dot, np.dot(mask, x), which is the same as (mask *
 x).sum(0)?


 I am not sure which way your argument goes, but I don't think you would find
 the following natural:

 x = array([True, True])
 dot(x,x)
 True

This is weird, but I would never do that. Maybe I would, but then I
would add 1 non-boolean.


 (x*x).sum()
 2
 (x*x).sum(0)
 2

That sounds right to me.
>>> (mask**2 == mask).all()
True


 (x*x).sum(False)
 2

What is axis=False?


The way my argument goes:
I'm a heavy user of *, pretending the bool behaves like an int,
and of sum and accumulate.
It would be a pain to lose them.

Where I come from (*), a bool is not a boolean; it is just 0 or 1,
given that numpy casting rules apply and it is sometimes cast back to
(0, 1). Does this also work as an explanation for the pattern of + and -?

(*) places where the type system is more restricted.


What about max?

>>> np.maximum(mask, mask)
array([False, False, False,  True,  True], dtype=bool)
>>> np.maximum(mask, ~mask)
array([ True,  True,  True,  True,  True], dtype=bool)
>>> mask + mask
array([False, False, False,  True,  True], dtype=bool)
>>> mask + ~mask
array([ True,  True,  True,  True,  True], dtype=bool)

The first mask is whether the wife has a car; the second mask is whether the
husband has a car. The max is whether there is a car in the family.

What is this as a logical operation?
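
One way to read the car example as a logical operation is as an elementwise OR. The sketch below uses made-up household data and hypothetical variable names, and assumes a recent NumPy in which + on two boolean arrays still acts as OR:

```python
import numpy as np

# Hypothetical household data: does each spouse own a car?
wife_has_car = np.array([False, False, True, True])
husband_has_car = np.array([False, True, False, True])

via_max = np.maximum(wife_has_car, husband_has_car)
via_plus = wife_has_car + husband_has_car      # boolean + saturates at True
via_or = wife_has_car | husband_has_car        # logical OR
via_func = np.logical_or(wife_has_car, husband_has_car)
```

All four spellings give the same boolean array here; | and np.logical_or state the intent most directly.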

Josef


