Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 11:25 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Dec 6, 2013 at 2:14 PM, josef.p...@gmail.com wrote: On Fri, Dec 6, 2013 at 3:50 PM, Sebastian Berg sebast...@sipsolutions.net wrote: On Fri, 2013-12-06 at 15:30 -0500, josef.p...@gmail.com wrote: On Fri, Dec 6, 2013 at 2:59 PM, Nathaniel Smith n...@pobox.com wrote: On Fri, Dec 6, 2013 at 11:55 AM, Alexander Belopolsky ndar...@mac.com wrote: On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote: On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote: unary versus binary minus Oh right; I consider binary `-` broken for Boolean arrays. (Sorry Alexander; I did not see your entire issue.) I'd rather write ~ than unary - if that's what it is. I agree. So I have no objection to elimination of the `-`. It looks like we are close to reaching a consensus on the following points: 1. * is well-defined on boolean arrays and may be used in preference of in code that is designed to handle 1s and 0s of any dtype in addition to booleans. 2. + is defined consistently with * and the only issue is the absence of additive inverse. This is not a problem as long as presence of - does not suggest otherwise. 3. binary and unary minus should be deprecated because its use in expressions where variables can be either boolean or numeric would lead to subtle bugs. For example -x*y would produce different results from -(x*y) depending on whether x is boolean or not. In all situations, ^ is preferable to binary - and ~ is preferable to unary -. 4. changing boolean arithmetics to auto-promotion to int is precluded by a significant use-case of boolean matrices. +1 +0.5 (I would still prefer a different binary minus, but it would be inconsistent with a logical unary minus that negates.) The question is if the current xor behaviour can make sense? It doesn't seem to make much sense mathematically? Which only leaves that `abs(x - y)` is actually what a (python) programmer might expect. 
I think I would like to deprecate at least the unary one. The ~ kind of behaviour just doesn't fit as far as I can see.

I haven't seen any real use cases for xor yet.

Using it instead of '+' yields a boolean ring instead of semi-ring. Papers from the first quarter of the last century used it pretty often on that account, hence 'sigma-rings', etc. Eventually the simplicity of the inclusive or overcame that tendency.

My impression is that both plus and minus are just overflow accidents and not intentional. plus works in a useful way, minus as xor might be used once per century.

It's certainly weird given that '+' means the inclusive or. I think '^' is much preferable. Although it makes some sense if one can keep the semantics straight. Complicated, though.

I'm looking at the test failure with allclose. Looks like - as xor still makes sense in some cases, because it doesn't need special cases for equality checks, for example:

x - y == 0 iff x == y

What happens to np.diff?

>>> np.diff(m1)
array([False, True, False], dtype=bool)

I'm using code like that to get the length of runs in a runstest. But in my current code, I actually have astype(int) and also use floats later on. If I read my (incompletely documented) code correctly, I needed also the sign not just the run length.

Just another argument what minus could be.

Josef

I would deprecate both unary and binary minus.

(And when nobody is looking in two versions from now, I would add a binary minus that overflows to the clipped version, so I get a set subtraction. :)

Where is '\' when you need it?

snip

Chuck

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
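The run-length use case mentioned above can be sketched as follows. This is only an illustration: the array `m` and the variable names are hypothetical, not taken from the runs-test code being discussed. Change points are found with `^` so the sketch does not depend on what `-` does for booleans, and the signed steps use an explicit integer cast.

```python
import numpy as np

# Hypothetical data: a boolean sequence whose runs we want to measure.
m = np.array([True, True, False, False, False, True, False])

# Positions where the value changes; xor of neighbours marks a run boundary.
change = m[1:] ^ m[:-1]

# Run boundaries as indices, including the two ends of the array.
edges = np.concatenate(([0], np.nonzero(change)[0] + 1, [m.size]))
run_lengths = np.diff(edges)

# Signed steps need an integer cast: +1 for False->True, -1 for True->False.
signed_steps = np.diff(m.astype(int))
```

The xor form only says where a run ends; the integer form also says in which direction the value changed, which is the extra information the message refers to.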
Re: [Numpy-discussion] Deprecate boolean math operators?
On Thu, Dec 5, 2013 at 7:33 PM, josef.p...@gmail.com wrote:
On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net wrote:

Hey, there was a discussion that for numpy booleans the math operators +, -, * (and the unary -), while defined, are not very helpful. I have set up a quick PR with a start (needs some fixes inside numpy still): https://github.com/numpy/numpy/pull/4105

The idea is to deprecate these, since the binary operators |, ^, & (and the unary ~, even if it is weird) behave identically. This would not affect sums of boolean arrays. For the moment I saw one annoying change in numpy, and that is `abs(x - y)` being used for allclose and working nicely currently.

I like mask = mask1 * mask2

That's what I learned working my way through scipy.stats.distributions a long time ago.

* is the least problematic case, since there numpy and python bools already almost agree. (They return the same values, but numpy returns a bool array instead of an integer array.)

On Thu, Dec 5, 2013 at 8:05 PM, Alan G Isaac alan.is...@gmail.com wrote:

For + and * (and thus `dot`), this will fix something that is not broken. It is in fact in conformance with a large literature on boolean arrays and boolean matrices.

Interesting point! I had assumed that dot() just upcast! But what do you think about the inconsistency between sum() and dot() on bool arrays?

-n
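The sum() versus dot() inconsistency raised here can be seen in a small sketch. The matrices `a` and `b` are hypothetical examples; the point is only that `dot` keeps the boolean dtype (an "or" over "and" terms) while `sum` counts.

```python
import numpy as np

# Hypothetical boolean matrices, e.g. adjacency matrices of two small graphs.
a = np.array([[True, True], [False, True]])
b = np.array([[True, False], [True, True]])

# Element-wise + and * on booleans act as logical or / logical and.
same_as_or = np.array_equal(a + b, a | b)
same_as_and = np.array_equal(a * b, a & b)

# dot() stays boolean: each entry is an "or" over "and" terms.
bool_product = np.dot(a, b)

# sum() upcasts and counts the True entries instead.
count = a.sum()

# Casting first gives the integer matrix product, which can exceed 1.
int_product = np.dot(a.astype(int), b.astype(int))
```

If a count of paths or matches is wanted rather than a yes/no answer, the explicit `astype(int)` before `dot` makes that intent visible.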
Re: [Numpy-discussion] Deprecate boolean math operators?
On Thu, 2013-12-05 at 23:02 -0500, josef.p...@gmail.com wrote:
On Thu, Dec 5, 2013 at 10:56 PM, Alexander Belopolsky ndar...@mac.com wrote:
On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net wrote:

there was a discussion that for numpy booleans math operators +,-,* (and the unary -), while defined, are not very helpful.

It has been suggested at the Github that there is an area where it is useful to have linear algebra operations like matrix multiplication to be defined over a semiring: http://en.wikipedia.org/wiki/Logical_matrix

This still does not justify having unary or binary -, so I suggest that we first discuss deprecation of those.

Does it make sense to only remove - and maybe / ?

would python sum still work? (I almost never use it.)

>>> sum(mask)
2
>>> sum(mask.tolist())
2

is accumulate the same as sum and would keep working?

>>> np.add.accumulate(mask)
array([0, 0, 0, 1, 2])

In operation with other dtypes, do they still dominate so these work?

Hey, of course the other types will always dominate interpreting bools as 0 and 1. This would only affect operations with only booleans. There is a good point that * is well defined however you define it, though. (Btw. / is not defined for bools, `np.bool_(True)/np.bool_(True)` will upcast to int8 to do the operation)

However, while well defined, + is not defined like it is for python bools (which are just ints) so that is the reason to consider deprecation there (if we allow upcast to int8 -- or maybe the default int -- in the future, in-place += and -= operations would not behave differently, since they just cast back...).

I suppose python sum works because it first tries using the C-Api number protocol, which also means it is not affected. If you were to write a sum which just uses the `+` operator, it would be affected, but that would seem good to me.

- Sebastian

>>> x / mask
array([0, 0, 0, 3, 4])
>>> x * 1. / mask
array([ nan, inf, inf, 3., 4.])
>>> x**mask
array([1, 1, 1, 3, 4])
>>> mask - 5
array([-5, -5, -5, -4, -4])

Josef
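The claim that other dtypes dominate, with booleans read as 0 and 1, can be checked with a short sketch. The arrays `x` and `mask` here are assumptions chosen to resemble the session above; the thread does not show how they were defined. Division is left out because its result depends on the Python and NumPy version.

```python
import numpy as np

# Hypothetical data standing in for the `x` and `mask` of the session above.
x = np.array([0, 1, 2, 3, 4])
mask = np.array([False, False, False, True, True])

# Python's built-in sum and the ndarray method both count the True entries.
builtin_count = sum(mask)
method_count = mask.sum()

# A cumulative sum of booleans is a running count, not a running "or".
running_count = mask.cumsum()

# With a non-boolean operand, the boolean is read as 0 or 1.
selected = x * mask
shifted = mask - 5
powered = x ** mask
```

None of these involve two boolean operands, so they fall outside the operations the proposal would touch.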
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 4:39 AM, Sebastian Berg sebast...@sipsolutions.net wrote: On Thu, 2013-12-05 at 23:02 -0500, josef.p...@gmail.com wrote: On Thu, Dec 5, 2013 at 10:56 PM, Alexander Belopolsky ndar...@mac.com wrote: On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net wrote: there was a discussion that for numpy booleans math operators +,-,* (and the unary -), while defined, are not very helpful. It has been suggested at the Github that there is an area where it is useful to have linear algebra operations like matrix multiplication to be defined over a semiring: http://en.wikipedia.org/wiki/Logical_matrix This still does not justify having unary or binary -, so I suggest that we first discuss deprecation of those. Does it make sense to only remove - and maybe / ? would python sum still work? (I almost never use it.) sum(mask) 2 sum(mask.tolist()) 2 is accumulate the same as sum and would keep working? np.add.accumulate(mask) array([0, 0, 0, 1, 2]) In operation with other dtypes, do they still dominate so these work? Hey, In statistics and econometrics (and economic theory) we just use an indicator function 1_{x=5} which has largely the same properties as a numpy bool array, at least in my code. some of the common operations are *, dot and kron. So far this has worked quite well as intuition, plus numpy casting rules. dot is the main surprise, because I thought that it would upcast. (I always think of dot as a np.linalg.) of course the other types will always dominate interpreting bools as 0 and 1. This would only affect operations with only booleans. My guess is that this would leave then 90% of our (statsmodels) possible usage alone. There is still the case that with * we can calculate the intersection. There is a good point that * is well defined however you define it, though. (Btw. 
/ is not defined for bools, `np.bool_(True)/np.bool_(True)` will upcast to int8 to do the operation) However, while well defined, + is not defined like it is for python bools (which are just ints) so that is the reason to consider deprecation there (if we allow upcast to int8 -- or maybe the default int -- in the future, in-place += and -= operations would not behave differently, since they just cast back...).

Actually, I used + once: The calculation in terms of indicator functions is

1_{A} + 1_{B} - 1_{A & B}

The last part avoids double counting, which is not necessary if numpy casts back to bool. Nothing that couldn't be replaced by logical operators, but the (linear) algebra is not logical.

In this case I did care about memory because the arrays are (nobs, nobs) (nobs is the number of observations, shape[0]) which can be large, and I have a sparse version also. In most other cases we use astype(int) already very often, because eventually we still have to cast and memory won't be a big problem.

The mental model is set membership and set operations with indicator functions, not logical, and I don't remember running into problems with this so far, and happily ignored logical_xxx when I do linear algebra instead of just working with masks of booleans.

Nevertheless: If I'm forced to, then I will get used to logical_xxx.

(*) And the above bool addition hasn't made it into statsmodels yet. I used a simpler version because I thought initially it's too cute. (And I was using an older numpy that couldn't do broadcasted dot.)

(*) how do you search in the documentation for `&` or `|`? I cannot find what the other symbols are, if there are any.

I suppose python sum works because it first tries using the C-Api number protocol, which also means it is not affected. If you were to write a sum which just uses the `+` operator, it would be affected, but that would seem good to me.

based on the ticket example, I'm not sure whether `+` should upcast or not.
>>> mm.dtype
dtype('bool')
>>> mm.sum(0)
array([48, 45, 56, 47])
>>> mm.sum(0, bool)
array([ True, True, True, True], dtype=bool)

I would just use any, but what happens with logical cumsum

>>> mm[:5].cumsum(0, bool)
array([[False, True, True, True],
       [ True, True, True, True],
       [ True, True, True, True],
       [ True, True, True, True],
       [ True, True, True, True]], dtype=bool)

same as mm[:5].astype(int).cumsum(0, bool) without casting

Josef

- Sebastian

>>> x / mask
array([0, 0, 0, 3, 4])
>>> x * 1. / mask
array([ nan, inf, inf, 3., 4.])
>>> x**mask
array([1, 1, 1, 3, 4])
>>> mask - 5
array([-5, -5, -5, -4, -4])

Josef
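The difference between the counting sum and the boolean sum in the session above can be reproduced on a small array. The matrix `mm` below is a hypothetical stand-in (the original `mm` is not shown in the thread); the comparison against `any` and `np.logical_or.accumulate` is an illustration of what the boolean accumulator amounts to.

```python
import numpy as np

# Hypothetical boolean matrix standing in for `mm` above.
mm = np.array([[False, True, False],
               [True, False, False],
               [True, True, False]])

# Default sum counts True entries per column.
counts = mm.sum(0)

# Forcing a boolean accumulator turns the sum into a column-wise "any".
any_via_sum = mm.sum(0, dtype=bool)

# The boolean cumulative sum is a running logical or down each column.
running_or = mm.cumsum(0, dtype=bool)
```

Spelling these as `mm.any(0)` and `np.logical_or.accumulate(mm, axis=0)` says the same thing without relying on how `+` is defined for booleans.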
Re: [Numpy-discussion] Deprecate boolean math operators?
On Thu, Dec 5, 2013 at 8:05 PM, Alan G Isaac alan.is...@gmail.com wrote:
For + and * (and thus `dot`), this will fix something that is not broken. It is in fact in conformance with a large literature on boolean arrays and boolean matrices.

On 12/6/2013 3:24 AM, Nathaniel Smith wrote:
Interesting point! I had assumed that dot() just upcast! But what do you think about the inconsistency between sum() and dot() on bool arrays?

I don't like the behavior of sum on bool arrays. (I.e., automatic upcasting.) But I do not suggest changing it, as much code is likely to depend on it.

Cheers,
Alan
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 9:32 AM, josef.p...@gmail.com wrote: On Fri, Dec 6, 2013 at 4:39 AM, Sebastian Berg sebast...@sipsolutions.net wrote: On Thu, 2013-12-05 at 23:02 -0500, josef.p...@gmail.com wrote: On Thu, Dec 5, 2013 at 10:56 PM, Alexander Belopolsky ndar...@mac.com wrote: On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net wrote: there was a discussion that for numpy booleans math operators +,-,* (and the unary -), while defined, are not very helpful. It has been suggested at the Github that there is an area where it is useful to have linear algebra operations like matrix multiplication to be defined over a semiring: http://en.wikipedia.org/wiki/Logical_matrix This still does not justify having unary or binary -, so I suggest that we first discuss deprecation of those. Does it make sense to only remove - and maybe / ? would python sum still work? (I almost never use it.) sum(mask) 2 sum(mask.tolist()) 2 is accumulate the same as sum and would keep working? np.add.accumulate(mask) array([0, 0, 0, 1, 2]) In operation with other dtypes, do they still dominate so these work? Hey, In statistics and econometrics (and economic theory) we just use an indicator function 1_{x=5} which has largely the same properties as a numpy bool array, at least in my code. some of the common operations are *, dot and kron. So far this has worked quite well as intuition, plus numpy casting rules. dot is the main surprise, because I thought that it would upcast. (I always think of dot as a np.linalg.) of course the other types will always dominate interpreting bools as 0 and 1. This would only affect operations with only booleans. My guess is that this would leave then 90% of our (statsmodels) possible usage alone. There is still the case that with * we can calculate the intersection. There is a good point that * is well defined however you define it, though. (Btw. 
/ is not defined for bools, `np.bool_(True)/np.bool_(True)` will upcast to int8 to do the operation) However, while well defined, + is not defined like it is for python bools (which are just ints) so that is the reason to consider deprecation there (if we allow upcast to int8 -- or maybe the default int -- in the future, in-place += and -= operations would not behave differently, since they just cast back...). Actually, I used + once: The calculation in terms of indicator functions is 1_{A} + 1_{B} - 1_{A B} The last part avoids double counting, which is not necessary if numpy casts back to bool. Nothing that couldn't be replaced by logical operators, but the (linear) algebra is not logical. In this case I did care about memory because the arrays are (nobs, nobs) (nobs is the number of observations shape[0]) which can be large, and I have a sparse version also. In most other case we use astype(int) already very often, because eventually we still have to cast and memory won't be a big problem. The mental model is set membership and set operations with indicator functions, not logical, and I don't remember running into problems with this so far, and happily ignored logical_xxx when I do linear algebra instead of just working with masks of booleans. http://en.wikipedia.org/wiki/Indicator_function with the added advantage that we have also the version where + constrains to (0, 1). However `-` doesn't work properly because np.bool_(-5) True instead of False except in the case `1 - mask`. We really have two kinds of addition: bool sum: for indicating set membership counting sum: for counting number of elements. 
from my viewpoint: I would keep + and * since they work well (bool + and count +); minus - is partially broken and `/` looks useless.

this casts anyway

>>> 1 - m1
array([1, 1, 0, 0, 0])

and I never thought of doing this

>>> True - m1
array([ True, True, False, False, False], dtype=bool)

(python set defines minus but raises error on plus)

Josef

Nevertheless: If I'm forced to, then I will get used to logical_xxx.

(*) And the above bool addition hasn't made it into statsmodels yet. I used a simpler version because I thought initially it's too cute. (And I was using an older numpy that couldn't do broadcasted dot.)

(*) how do you search in the documentation for `&` or `|`? I cannot find what the other symbols are, if there are any.

I suppose python sum works because it first tries using the C-Api number protocol, which also means it is not affected. If you were to write a sum which just uses the `+` operator, it would be affected, but that would seem good to me.

based on the ticket example, I'm not sure whether `+` should upcast or not.

>>> mm.dtype
dtype('bool')
>>> mm.sum(0)
array([48, 45, 56, 47])
>>> mm.sum(0, bool)
array([ True, True, True, True], dtype=bool)

I would just use any, but what happens with logical cumsum

>>> mm[:5].cumsum(0, bool)
array([[False, True, True, True], [ True,
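The two kinds of addition described in this message (a boolean sum for set membership and a counting sum) can be sketched with indicator arrays. The arrays `in_a` and `in_b` are hypothetical; the integer line follows the inclusion-exclusion form of the indicator calculation, and the boolean line shows why the correction term is unnecessary when `+` saturates.

```python
import numpy as np

# Hypothetical membership indicators for two sets A and B over five items.
in_a = np.array([True, True, False, False, True])
in_b = np.array([False, True, True, False, True])

# Counting arithmetic needs the correction term to avoid double counting.
union_count = in_a.astype(int) + in_b.astype(int) - (in_a & in_b).astype(int)

# Boolean addition saturates at True, so no correction term is needed.
union_bool = in_a + in_b

# Boolean multiplication gives the intersection.
intersection = in_a * in_b
```

Both routes give the same membership mask; the boolean one avoids allocating integer arrays, which matters for the large (nobs, nobs) case mentioned in the thread.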
Re: [Numpy-discussion] Deprecate boolean math operators?
On 12/5/2013 11:14 PM, Alexander Belopolsky wrote:
did you find minus to be as useful?

It is also a correct usage. I think a good approach to this is to first realize that there were good reasons for the current behavior.

Alan Isaac
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 11:13 AM, Alan G Isaac alan.is...@gmail.com wrote:
On 12/5/2013 11:14 PM, Alexander Belopolsky wrote:
did you find minus to be as useful?

It is also a correct usage. I think a good approach to this is to first realize that there were good reasons for the current behavior.

What's the meaning of minus? I cannot make much sense out of it, or come up with any use case.

Josef

Alan Isaac
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 11:13 AM, Alan G Isaac alan.is...@gmail.com wrote:
On 12/5/2013 11:14 PM, Alexander Belopolsky wrote:
did you find minus to be as useful?

It is also a correct usage.

Can you provide a reference?

I think a good approach to this is to first realize that there were good reasons for the current behavior.

Maybe there were, in which case the current behavior should be documented somewhere. What is the rationale for this:

>>> -array(True) + array(True)
True

?

I am not aware of any algebraic system where unary minus denotes anything other than additive inverse. Having bools form a semiring under + and * is a fine (yet somewhat unusual) choice, but once you've made that choice you lose subtraction because True + x = True no longer has a unique solution.
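The non-uniqueness argument can be checked by brute force over the two possible values of x. This is a minimal sketch with hypothetical variable names; the xor line is added only for contrast with the ring structure mentioned elsewhere in the thread.

```python
import numpy as np

# Both candidate values of x, tried against True + x.
x = np.array([False, True])
lhs = np.array([True, True]) + x

# Every candidate solves True + x == True, so no single "True - True" exists.
solutions = x[lhs].tolist()

# Contrast with xor, where x ^ x == False gives each element its own inverse.
xor_with_self = x ^ x
```

Since both False and True solve the equation, any definition of subtraction as "the x that undoes +" has to pick one arbitrarily.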
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 12:23 PM, Alexander Belopolsky ndar...@mac.com wrote:
On Fri, Dec 6, 2013 at 11:13 AM, Alan G Isaac alan.is...@gmail.com wrote:
On 12/5/2013 11:14 PM, Alexander Belopolsky wrote:
did you find minus to be as useful?

It is also a correct usage.

Can you provide a reference?

I think a good approach to this is to first realize that there were good reasons for the current behavior.

Maybe there were, in which case the current behavior should be documented somewhere. What is the rationale for this:

>>> -array(True) + array(True)
True

?

I am not aware of any algebraic system where unary minus denotes anything other than additive inverse.

I would be perfectly happy if numpy would cast (negative) overflow to the smallest value, instead of wrapping around. The same is true for integers.

>>> np.array(0, np.int8) - np.array(-128, np.int8)
-128
>>> - np.array(-128, np.int8)
-128

Josef

It's consistent. But does it make sense?

Having bools form a semiring under + and * is a fine (yet somewhat unusual) choice, but once you've made that choice you lose subtraction because True + x = True no longer has a unique solution.
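The int8 wraparound shown above, and one possible reading of "cast overflow to the nearest representable value", might look like the sketch below. The saturating variant is an assumption made for illustration (widen, clip, cast back); it is not NumPy's default behaviour and not something the thread specifies.

```python
import numpy as np

lowest = np.array([-128], dtype=np.int8)

# int8 cannot hold +128, so negation wraps back to -128.
wrapped = -lowest

# A saturating alternative (an assumption, not NumPy's default): widen, then clip.
info = np.iinfo(np.int8)
saturated = np.clip(-lowest.astype(np.int16), info.min, info.max).astype(np.int8)
```

For booleans the analogous clipping would send a negative difference to False, which is the set-subtraction reading of minus discussed later in the thread.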
Re: [Numpy-discussion] Deprecate boolean math operators?
On 12/6/2013 12:23 PM, Alexander Belopolsky wrote:
What is the rationale for this:

>>> -array(True) + array(True)
True

The minus is complementation. So you are just writing `False or True`.

Alan Isaac
Re: [Numpy-discussion] Deprecate boolean math operators?
On 12/5/2013 11:14 PM, Alexander Belopolsky wrote:
did you find minus to be as useful?

On Fri, Dec 6, 2013 at 11:13 AM, Alan G Isaac wrote:
It is also a correct usage.

On 12/6/2013 12:23 PM, Alexander Belopolsky wrote:
Can you provide a reference?

For use of the minus sign, I don't have one at hand, but a quick Google search comes up with: http://www.csee.umbc.edu/~artola/fall02/BooleanAlgebra.ppt

It is more common to use a superscript `c`, but that's just a notational issue.

For multiplication, addition, and dot, you can see Ki Hang Kim's Boolean Matrix Theory and Applications. Applications are endless and include graph theory and (then naturally) circuit design.

Cheers,
Alan Isaac
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 1:16 PM, Alan G Isaac alan.is...@gmail.com wrote:
On 12/6/2013 12:23 PM, Alexander Belopolsky wrote:
What is the rationale for this:

>>> -array(True) + array(True)
True

The minus is complementation. So you are just writing `False or True`.

unary versus binary minus

>>> m1 + (-m2)
array([False, False, True, True, True], dtype=bool)
>>> m1 - m2
array([ True, True, False, False, True], dtype=bool)
>>> -m2 + m1
array([False, False, True, True, True], dtype=bool)
>>> m1 - (-m2)
array([False, False, True, True, False], dtype=bool)

I'd rather write ~ than unary - if that's what it is.

Josef

Alan Isaac
Re: [Numpy-discussion] Deprecate boolean math operators?
On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
unary versus binary minus

Oh right; I consider binary `-` broken for Boolean arrays. (Sorry Alexander; I did not see your entire issue.)

I'd rather write ~ than unary - if that's what it is.

I agree. So I have no objection to elimination of the `-`. I see it does the subtraction and then a boolean conversion, which is not helpful. Or rather, I do not see how it can be helpful.

Alan Isaac
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote:
On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
unary versus binary minus

Oh right; I consider binary `-` broken for Boolean arrays. (Sorry Alexander; I did not see your entire issue.)

I'd rather write ~ than unary - if that's what it is.

I agree. So I have no objection to elimination of the `-`. I see it does the subtraction and then a boolean conversion, which is not helpful. Or rather, I do not see how it can be helpful.

What I would or might find useful is if binary `-` subtracts set membership instead of doing xor

>>> m1 = np.array([0,0,1,1], bool)
>>> m2 = np.array([0,1,0,1], bool)
>>> m1 - m2
array([False, True, True, False], dtype=bool)
>>> np.logical_xor(m1, m2)
array([False, True, True, False], dtype=bool)
>>> np.clip(m1.astype(int) - m2.astype(int), 0, 1).astype(bool)
array([False, False, True, False], dtype=bool)
>>> np.nonzero(_)[0]
array([2])
>>> s1 = set(np.arange(4)[m1])
>>> s2 = set(np.arange(4)[m2])
>>> s1 - s2
set([2])

Josef

Alan Isaac
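The set-subtraction reading of minus can also be written with logical operators only, which avoids the integer round trip. The sketch below reuses `m1` and `m2` from the message; `m1 & ~m2` is offered as one equivalent spelling, not as something proposed in the thread.

```python
import numpy as np

m1 = np.array([0, 0, 1, 1], dtype=bool)
m2 = np.array([0, 1, 0, 1], dtype=bool)

# Symmetric difference: what boolean `-` computed at the time of this thread.
symmetric = m1 ^ m2

# Set difference "in m1 but not in m2", spelled with logical operators.
difference = m1 & ~m2

# The clipped integer subtraction from the message gives the same mask.
clipped = np.clip(m1.astype(int) - m2.astype(int), 0, 1).astype(bool)

# Cross-check against Python sets built from the selected indices.
s1 = set(np.arange(4)[m1].tolist())
s2 = set(np.arange(4)[m2].tolist())
```

The mask and the Python sets agree: only index 2 is in the first set and not in the second.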
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote:
On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
unary versus binary minus

Oh right; I consider binary `-` broken for Boolean arrays. (Sorry Alexander; I did not see your entire issue.)

I'd rather write ~ than unary - if that's what it is.

I agree. So I have no objection to elimination of the `-`.

It looks like we are close to reaching a consensus on the following points:

1. * is well-defined on boolean arrays and may be used in preference to & in code that is designed to handle 1s and 0s of any dtype in addition to booleans.

2. + is defined consistently with * and the only issue is the absence of an additive inverse. This is not a problem as long as the presence of - does not suggest otherwise.

3. binary and unary minus should be deprecated because their use in expressions where variables can be either boolean or numeric would lead to subtle bugs. For example, -x*y would produce different results from -(x*y) depending on whether x is boolean or not. In all situations, ^ is preferable to binary - and ~ is preferable to unary -.

4. changing boolean arithmetic to auto-promotion to int is precluded by a significant use-case of boolean matrices.
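Point 3 can be illustrated with a small sketch. Recent NumPy releases reject `-` on boolean arrays, so the boolean half below models unary minus as `~` and `*` as `&`, which is how the thread describes the behaviour at the time; the arrays `x` and `y` are hypothetical.

```python
import numpy as np

x = np.array([True, True, False])
y = np.array([True, False, True])

# Integer reading: negation distributes over the product, so grouping is irrelevant.
xi, yi = x.astype(int), y.astype(int)
int_same = np.array_equal(-xi * yi, -(xi * yi))

# Boolean reading, with unary minus spelled as ~ (complement) and * as &.
negate_then_multiply = ~x & y       # models -x*y
multiply_then_negate = ~(x & y)     # models -(x*y)
```

With integers the two groupings agree; with the complement reading they do not, which is the kind of dtype-dependent result the message warns about.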
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 11:55 AM, Alexander Belopolsky ndar...@mac.com wrote:
On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote:
On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
unary versus binary minus

Oh right; I consider binary `-` broken for Boolean arrays. (Sorry Alexander; I did not see your entire issue.)

I'd rather write ~ than unary - if that's what it is.

I agree. So I have no objection to elimination of the `-`.

It looks like we are close to reaching a consensus on the following points:

1. * is well-defined on boolean arrays and may be used in preference to & in code that is designed to handle 1s and 0s of any dtype in addition to booleans.

2. + is defined consistently with * and the only issue is the absence of an additive inverse. This is not a problem as long as the presence of - does not suggest otherwise.

3. binary and unary minus should be deprecated because their use in expressions where variables can be either boolean or numeric would lead to subtle bugs. For example, -x*y would produce different results from -(x*y) depending on whether x is boolean or not. In all situations, ^ is preferable to binary - and ~ is preferable to unary -.

4. changing boolean arithmetic to auto-promotion to int is precluded by a significant use-case of boolean matrices.

+1

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 2:59 PM, Nathaniel Smith n...@pobox.com wrote:
On Fri, Dec 6, 2013 at 11:55 AM, Alexander Belopolsky ndar...@mac.com wrote:
On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote:
On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
unary versus binary minus

Oh right; I consider binary `-` broken for Boolean arrays. (Sorry Alexander; I did not see your entire issue.)

I'd rather write ~ than unary - if that's what it is.

I agree. So I have no objection to elimination of the `-`.

It looks like we are close to reaching a consensus on the following points:

1. * is well-defined on boolean arrays and may be used in preference to & in code that is designed to handle 1s and 0s of any dtype in addition to booleans.

2. + is defined consistently with * and the only issue is the absence of an additive inverse. This is not a problem as long as the presence of - does not suggest otherwise.

3. binary and unary minus should be deprecated because their use in expressions where variables can be either boolean or numeric would lead to subtle bugs. For example, -x*y would produce different results from -(x*y) depending on whether x is boolean or not. In all situations, ^ is preferable to binary - and ~ is preferable to unary -.

4. changing boolean arithmetic to auto-promotion to int is precluded by a significant use-case of boolean matrices.

+1

+0.5 (I would still prefer a different binary minus, but it would be inconsistent with a logical unary minus that negates.)

5. `/` is useless

6. `**` follows from 1.

Josef

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, 2013-12-06 at 15:30 -0500, josef.p...@gmail.com wrote:
> On Fri, Dec 6, 2013 at 2:59 PM, Nathaniel Smith n...@pobox.com wrote:
>> On Fri, Dec 6, 2013 at 11:55 AM, Alexander Belopolsky ndar...@mac.com wrote:
>>> On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote:
>>>> On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
>>>>> unary versus binary minus
>>>>
>>>> Oh right; I consider binary `-` broken for Boolean arrays. (Sorry Alexander; I did not see your entire issue.)
>>>>
>>>>> I'd rather write ~ than unary - if that's what it is.
>>>>
>>>> I agree. So I have no objection to elimination of the `-`.
>>>
>>> It looks like we are close to reaching a consensus on the following points:
>>>
>>> 1. * is well-defined on boolean arrays and may be used in preference of & in code that is designed to handle 1s and 0s of any dtype in addition to booleans.
>>> 2. + is defined consistently with * and the only issue is the absence of additive inverse. This is not a problem as long as presence of - does not suggest otherwise.
>>> 3. binary and unary minus should be deprecated because its use in expressions where variables can be either boolean or numeric would lead to subtle bugs. For example -x*y would produce different results from -(x*y) depending on whether x is boolean or not. In all situations, ^ is preferable to binary - and ~ is preferable to unary -.
>>> 4. changing boolean arithmetics to auto-promotion to int is precluded by a significant use-case of boolean matrices.
>>
>> +1
>
> +0.5 (I would still prefer a different binary minus, but it would be inconsistent with a logical unary minus that negates.)

The question is if the current xor behaviour can make sense? It doesn't seem to make much sense mathematically? Which only leaves that `abs(x - y)` is actually what a (python) programmer might expect.

I think I would like to deprecate at least the unary one. The ~ kind of behaviour just doesn't fit as far as I can see.

> 5. `/` is useless
> 6. `**` follows from 1.

Both of these are currently not defined, they will just cause upcast to int8. I suppose it would be possible to deprecate that upcast though (same goes for most all other ufuncs/operators in principle).

> Josef
>
>> --
>> Nathaniel J. Smith
>> Postdoctoral researcher - Informatics - University of Edinburgh
>> http://vorpus.org

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
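Two of the claims in this message can be checked directly. The sketch below assumes a recent NumPy and spells the old boolean minus explicitly as `~` (unary) and `^` (binary), since those are the readings discussed in the thread; the arrays are illustrative only.

```python
import numpy as np

x = np.array([False, False, True, True])
y = np.array([False, True, False, True])
xi, yi = x.astype(int), y.astype(int)

# On 0/1 integers, abs(x - y) is 1 exactly where the inputs differ,
# which is the truth table of xor on booleans.
abs_diff = np.abs(xi - yi).astype(bool)
xor = x ^ y

# Point 3 of the summary: reading unary minus as "not" and * as "and",
# (-x)*y and -(x*y) are different boolean expressions ...
neg_then_mul = ~x & y      # boolean reading of (-x)*y
mul_then_neg = ~(x & y)    # boolean reading of -(x*y)

# ... whereas for ordinary integers the two always agree.
int_agree = np.array_equal((-xi) * yi, -(xi * yi))

print(xor, abs_diff)
print(neg_then_mul, mul_then_neg, int_agree)
```

So the same source expression gives different answers depending on whether the inputs happen to be boolean, which is the subtle-bug risk the summary describes.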
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 3:50 PM, Sebastian Berg sebast...@sipsolutions.net wrote:
> [snip]
> The question is if the current xor behaviour can make sense? It doesn't seem to make much sense mathematically? Which only leaves that `abs(x - y)` is actually what a (python) programmer might expect.
>
> I think I would like to deprecate at least the unary one. The ~ kind of behaviour just doesn't fit as far as I can see.

I haven't seen any real use cases for xor yet. My impression is that both plus and minus are just overflow accidents and not intentional. plus works in a useful way, minus as xor might be used once per century.

I would deprecate both unary and binary minus.

(And when nobody is looking in two versions from now, I would add a binary minus that overflows to the clipped version, so I get a set subtraction. :)

>> 5. `/` is useless
>> 6. `**` follows from 1.

    m1 ** m2
    array([1, 0, 1, 1], dtype=int8)
    m1 ** 2
    array([False, False, True, True], dtype=bool)
    m1 ** 3
    array([0, 0, 1, 1])

but I'm using python with an old numpy right now

    np.__version__
    '1.6.1'

> Both of these are currently not defined, they will just cause upcast to int8. I suppose it would be possible to deprecate that upcast though (same goes for most all other ufuncs/operators in principle).

We would have to start the discussion again for all other operators/ufuncs to see if they are useful in some cases. For most treating as int will make sense, I guess.

Josef
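The "binary minus that overflows to the clipped version" can be sketched as follows. `clipped_minus` is a hypothetical helper, not a NumPy function. The message does not give the values of `m1` and `m2`; the ones below are an assumption chosen to be consistent with the results printed above.

```python
import numpy as np

# Assumed inputs, consistent with the printed results in the message.
m1 = np.array([False, False, True, True])
m2 = np.array([False, True, False, True])

def clipped_minus(a, b):
    """Subtract as 0/1 integers and clip negative results to 0 (hypothetical helper)."""
    diff = a.astype(np.int8) - b.astype(np.int8)
    return np.clip(diff, 0, 1).astype(bool)

# Clipping 0 - 1 to 0 turns the subtraction into a set difference.
set_difference = m1 & ~m2
print(clipped_minus(m1, m2))
print(set_difference)
```

Since the two inputs cover all four truth-value combinations, agreement here means the clipped subtraction and `m1 & ~m2` have the same truth table.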
Re: [Numpy-discussion] Deprecate boolean math operators?
On 12/6/2013 3:30 PM, josef.p...@gmail.com wrote:
> 6 `**` follows from 1.

Yes, but what really matters is that linalg.matrix_power give the correct (boolean) result.

Alan
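What a "correct (boolean) result" means for a matrix power can be sketched with repeated `np.dot`, which for boolean inputs combines entries with "or" of "and". `bool_matrix_power` is a hypothetical helper used only for illustration (it is not `linalg.matrix_power` itself), and the adjacency matrix is made-up data.

```python
import numpy as np

# Adjacency matrix of a small directed graph: 0 -> 1 -> 2 (made-up data).
adj = np.array([[False, True,  False],
                [False, False, True],
                [False, False, False]])

def bool_matrix_power(a, n):
    """n-th power (n >= 1) over the boolean semiring, via repeated np.dot."""
    result = a
    for _ in range(n - 1):
        result = np.dot(result, a)
    return result

# Entry (i, j) of the k-th power says whether j is reachable from i in k steps.
two_steps = bool_matrix_power(adj, 2)
three_steps = bool_matrix_power(adj, 3)
print(two_steps)
```

The result stays boolean, so it can be read directly as a reachability relation rather than a path count.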
Re: [Numpy-discussion] Deprecate boolean math operators?
On 12/6/2013 3:50 PM, Sebastian Berg wrote:
> Both of these are currently not defined, they will just cause upcast to int8.

What does currently mean? `**` works fine for boolean arrays in 1.7.1. (It's useless, but it works.)

Alan Isaac
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 4:14 PM, josef.p...@gmail.com wrote:
> [snip]
> I would deprecate both unary and binary minus.
>
> (And when nobody is looking in two versions from now, I would add a binary minus that overflows to the clipped version, so I get a set subtraction. :)

Actually minus works as expected if we avoid negative overflow:

    m1 - m1*m2
    array([False, False, True, False], dtype=bool)
    m1 * ~m2
    array([False, False, True, False], dtype=bool)
    m1 & ~m2
    array([False, False, True, False], dtype=bool)

I find the first easy to read, but m1 - m2 would be one operation less, and chain more easily:

    m1 - m2 - m3

m1 are mailing list subscribers, take away m2 owners of apples, take away m3 users of Linux = exotic developers

Josef

> [snip]
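The chained "take away" example can be written with the logical operators alone. This is a sketch on made-up data, with variable names standing in for m1, m2 and m3. Because recent NumPy versions reject `-` between two boolean arrays, the `m1 - m1*m2` form is spelled as `m1 ^ (m1 & m2)` here, which has the same truth table.

```python
import numpy as np

# Made-up membership masks over five people.
subscribers  = np.array([True,  True,  True,  True,  False])  # m1
apple_owners = np.array([True,  False, False, True,  False])  # m2
linux_users  = np.array([False, True,  False, True,  True])   # m3

# Set difference chains naturally with "and not".
exotic = subscribers & ~apple_owners & ~linux_users

# m1 - m1*m2 never goes negative, so xor reproduces it exactly.
minus_without_overflow = subscribers ^ (subscribers & apple_owners)
and_not = subscribers & ~apple_owners

print(exotic)
```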
Re: [Numpy-discussion] Deprecate boolean math operators?
Not sure how much time it's worth spending on coming up with new definitions for boolean subtraction, since even if we deprecate the current behavior now we won't be able to implement any of them for a year+, and then we'll end up having to go through these debates again then anyway.

-n

On Fri, Dec 6, 2013 at 2:29 PM, josef.p...@gmail.com wrote:
> [snip]

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 5:45 PM, Nathaniel Smith n...@pobox.com wrote:
> Not sure how much time it's worth spending on coming up with new definitions for boolean subtraction, since even if we deprecate the current behavior now we won't be able to implement any of them for a year+, and then we'll end up having to go through these debates again then anyway.

I didn't argue against deprecation of the boolean minuses. I'm fine with that.

Just some early lobbying, and so I can save my examples where I can google them in case I'm still around if or when we can revisit the issue. Once I turn off the python interpreter that I used for the examples, I will forget everything about weird boolean operations.

One advantage of this thread is that I had to look up the math for indicator functions, and that I have a better idea where I could use logical operators instead of (linear) algebra.

Josef

> -n
> [snip]
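The indicator-function identities alluded to here can be checked exhaustively. The sketch below assumes the standard identities for indicators of sets A and B (intersection, union, difference, symmetric difference) and compares the integer-arithmetic form with the logical-operator form over all four truth-value combinations.

```python
import itertools
import numpy as np

# All four combinations of (a, b).
pairs = np.array(list(itertools.product([False, True], repeat=2)))
a, b = pairs[:, 0], pairs[:, 1]
ia, ib = a.astype(int), b.astype(int)

# Arithmetic on 0/1 indicators versus logical operators on booleans.
intersection_ok = np.array_equal(ia * ib, (a & b).astype(int))
union_ok = np.array_equal(ia + ib - ia * ib, (a | b).astype(int))
difference_ok = np.array_equal(ia - ia * ib, (a & ~b).astype(int))
symdiff_ok = np.array_equal(np.abs(ia - ib), (a ^ b).astype(int))

print(intersection_ok, union_ok, difference_ok, symdiff_ok)
```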
Re: [Numpy-discussion] Deprecate boolean math operators?
On Fri, Dec 6, 2013 at 2:14 PM, josef.p...@gmail.com wrote:
> [snip]
> I haven't seen any real use cases for xor yet.

Using it instead of '+' yields a boolean ring instead of semi-ring. Papers from the first quarter of the last century used it pretty often on that account, hence 'sigma-rings', etc. Eventually the simplicity of the inclusive or overcame that tendency.

> My impression is that both plus and minus are just overflow accidents and not intentional. plus works in a useful way, minus as xor might be used once per century.

It's certainly weird given that '+' means the inclusive or. I think '^' is much preferable. Although it makes some sense if one can keep the semantics straight. Complicated, though.

> I would deprecate both unary and binary minus.
>
> (And when nobody is looking in two versions from now, I would add a binary minus that overflows to the clipped version, so I get a set subtraction. :)

Where is '\' when you need it?

[snip]

Chuck
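The ring versus semi-ring distinction can be verified by brute force on the two truth values. This sketch uses plain Python booleans and checks only the two properties relevant to the argument: whether every element has an additive inverse, and whether "and" distributes over the chosen addition.

```python
from itertools import product

B = (False, True)

# Addition = xor: every element is its own additive inverse, so (xor, and) is a ring.
xor_has_inverses = all(any((x ^ y) is False for y in B) for x in B)

# Addition = inclusive or: True has no additive inverse, so (or, and) is only a semiring.
or_has_inverses = all(any((x | y) is False for y in B) for x in B)

# Multiplication (and) distributes over the addition in both structures.
xor_distributes = all((x & (y ^ z)) == ((x & y) ^ (x & z))
                      for x, y, z in product(B, repeat=3))
or_distributes = all((x & (y | z)) == ((x & y) | (x & z))
                     for x, y, z in product(B, repeat=3))

print(xor_has_inverses, or_has_inverses, xor_distributes, or_distributes)
```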
[Numpy-discussion] Deprecate boolean math operators?
Hey,

there was a discussion that for numpy booleans math operators +,-,* (and the unary -), while defined, are not very helpful. I have set up a quick PR with a start (needs some fixes inside numpy still):

https://github.com/numpy/numpy/pull/4105

The idea is to deprecate these, since the binary operators |,^,& (and the unary ~ even if it is weird) behave identically. This would not affect sums of boolean arrays. For the moment I saw one annoying change in numpy, and that is `abs(x - y)` being used for allclose and working nicely currently.

- Sebastian
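The equivalences behind the proposal can be sketched as below, assuming a recent NumPy. There `+` and `*` on boolean arrays still behave as `|` and `&`, while `-` between two boolean arrays raises `TypeError`. The versions discussed in this thread returned xor instead, so the sketch accepts either outcome.

```python
import numpy as np

a = np.array([False, False, True, True])
b = np.array([False, True, False, True])

plus_is_or = np.array_equal(a + b, a | b)
times_is_and = np.array_equal(a * b, a & b)

# Sums are reductions, not the elementwise operator: they count True values.
count = int(a.sum())

# Binary minus on two boolean arrays: TypeError on recent NumPy,
# xor on the 2013-era versions. The explicit spelling is ^.
try:
    minus = a - b
except TypeError:
    minus = None
xor = a ^ b

print(plus_is_or, times_is_and, count, xor)
```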
Re: [Numpy-discussion] Deprecate boolean math operators?
On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net wrote:
> For the moment I saw one annoying change in numpy, and that is `abs(x - y)` being used for allclose and working nicely currently.

It would probably be an improvement if allclose returned all(x == y) unless one of the arguments is inexact. At the moment allclose() fails for char arrays:

    allclose('abc', 'abc')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "numpy/core/numeric.py", line 2114, in allclose
        xinf = isinf(x)
    TypeError: Not implemented for this type
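The suggested behaviour could look roughly like this. `allclose_or_equal` is a hypothetical wrapper written for illustration, not NumPy's implementation: it uses the tolerance-based comparison only when at least one argument has an inexact (floating or complex) dtype, and exact equality otherwise.

```python
import numpy as np

def allclose_or_equal(x, y, rtol=1e-5, atol=1e-8):
    """Hypothetical wrapper: tolerance test for inexact dtypes, exact equality otherwise."""
    x = np.asarray(x)
    y = np.asarray(y)
    if np.issubdtype(x.dtype, np.inexact) or np.issubdtype(y.dtype, np.inexact):
        return bool(np.allclose(x, y, rtol=rtol, atol=atol))
    # Booleans, integers and strings need no subtraction at all.
    return bool(np.all(x == y))

print(allclose_or_equal('abc', 'abc'))
print(allclose_or_equal([True, False], [True, False]))
print(allclose_or_equal([1.0], [1.0 + 1e-12]))
```

With this shape, boolean inputs never reach `abs(x - y)`, so deprecating boolean minus would not affect them.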
Re: [Numpy-discussion] Deprecate boolean math operators?
On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net wrote:
> Hey,
>
> there was a discussion that for numpy booleans math operators +,-,* (and the unary -), while defined, are not very helpful.
> [snip]

I like

    mask = mask1 * mask2

That's what I learned working my way through scipy.stats.distributions a long time ago.

But the main thing is that we often use booleans as 0,1 integer arrays in the actual calculations, and I only sometimes add the astype(int):

    x[:, None] * (y[:, None] == np.unique(y))

I always thought booleans *are* just 0, 1 integers, until last time there was the discussion we saw the weird + or - behavior.

We also use rescaling to (-1, 1) in statsmodels:

    y = mask * 2 - 1

(but maybe we convert to integer first)

My guess is that I only use multiplication heavily, where the boolean is a dummy variable with 0 if male and 1 if female for example.

Nothing serious but nice not to have to worry about casting with astype(int) first.

    x[:, None] * (y[:, None] == np.unique(y)).astype(int)

(Is the bracket at the right spot ?)

Josef
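The dummy-variable pattern works because a boolean array mixed with a numeric operand is promoted to the numeric type. A minimal sketch on made-up data follows; the bracket in the last expression of the message is in the right spot, since `astype` applies to the parenthesised comparison.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array(["a", "b", "a", "c"])   # made-up group labels

# One boolean column per unique label: shape (4, 3).
dummies = (y[:, None] == np.unique(y))

# Boolean times float already gives floats, with or without the explicit cast.
by_group = x[:, None] * dummies
by_group_cast = x[:, None] * dummies.astype(int)

# Rescaling a mask to (-1, 1): the integer operands promote the result to integers.
mask = np.array([True, False, True])
signs = mask * 2 - 1

print(by_group)
print(signs)
```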
Re: [Numpy-discussion] Deprecate boolean math operators?
On Thu, Dec 5, 2013 at 10:33 PM, josef.p...@gmail.com wrote:
> [snip]
> Nothing serious but nice not to have to worry about casting with astype(int) first.
>
>     x[:, None] * (y[:, None] == np.unique(y)).astype(int)
>
> (Is the bracket at the right spot ?)

What about np.dot? np.dot(mask, x), which is the same as (mask * x).sum(0)?

Josef
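For a boolean mask and a numeric vector the two spellings agree, as this small sketch on made-up 1-D data shows. The mask is promoted to the dtype of `x`, so no boolean arithmetic is involved.

```python
import numpy as np

mask = np.array([True, False, True, True])
x = np.array([1.5, 2.0, 3.0, 4.5])

via_dot = np.dot(mask, x)        # mask is promoted to float
via_sum = (mask * x).sum(0)      # elementwise product, then sum

print(via_dot, via_sum)
```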
Re: [Numpy-discussion] Deprecate boolean math operators?
On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net wrote:
> there was a discussion that for numpy booleans math operators +,-,* (and the unary -), while defined, are not very helpful.

It has been suggested on GitHub that there is an area where it is useful to have linear algebra operations like matrix multiplication defined over a semiring:

http://en.wikipedia.org/wiki/Logical_matrix

This still does not justify having unary or binary -, so I suggest that we first discuss deprecation of those.
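Matrix multiplication over the boolean semiring replaces "sum of products" with "or of ands". The sketch below, on made-up matrices, compares `np.dot` on boolean inputs against an explicit broadcast version of that definition.

```python
import numpy as np

A = np.array([[True, False],
              [True, True]])
B = np.array([[False, True],
              [False, False]])

# np.dot on two boolean matrices stays boolean.
semiring_product = np.dot(A, B)

# Explicit definition: C[i, j] = OR over k of (A[i, k] AND B[k, j]).
explicit = (A[:, :, None] & B[None, :, :]).any(axis=1)

print(semiring_product)
```

Read as relations, the product is the composition of the two relations, which is the use case the linked article describes.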
Re: [Numpy-discussion] Deprecate boolean math operators?
On Thu, Dec 5, 2013 at 10:35 PM, josef.p...@gmail.com wrote:
> what about np.dot,np.dot(mask, x) which is the same as (mask * x).sum(0) ?

I am not sure which way your argument goes, but I don't think you would find the following natural:

    x = array([True, True])
    dot(x,x)
    True
    (x*x).sum()
    2
    (x*x).sum(0)
    2
    (x*x).sum(False)
    2
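The asymmetry shown above comes from the reduction step: `dot` on two boolean arrays reduces with "or", while `.sum()` promotes to integers and counts. A short sketch, assuming a recent NumPy:

```python
import numpy as np

x = np.array([True, True])

semiring_dot = np.dot(x, x)   # OR of ANDs: stays a boolean
count = (x * x).sum()         # elementwise AND, then an integer count

print(semiring_dot, count)
```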
Re: [Numpy-discussion] Deprecate boolean math operators?
On Thu, Dec 5, 2013 at 10:56 PM, Alexander Belopolsky ndar...@mac.com wrote:
> On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net wrote:
>> there was a discussion that for numpy booleans math operators +,-,* (and the unary -), while defined, are not very helpful.
>
> It has been suggested at the Github that there is an area where it is useful to have linear algebra operations like matrix multiplication to be defined over a semiring:
>
> http://en.wikipedia.org/wiki/Logical_matrix
>
> This still does not justify having unary or binary -, so I suggest that we first discuss deprecation of those.

Does it make sense to only remove - and maybe / ?

Would python sum still work? (I almost never use it.)

    sum(mask)
    2
    sum(mask.tolist())
    2

Is accumulate the same as sum, and would it keep working?

    np.add.accumulate(mask)
    array([0, 0, 0, 1, 2])

In operation with other dtypes, do they still dominate so these work?

    x / mask
    array([0, 0, 0, 3, 4])
    x * 1. / mask
    array([ nan, inf, inf, 3., 4.])
    x**mask
    array([1, 1, 1, 3, 4])
    mask - 5
    array([-5, -5, -5, -4, -4])

Josef
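Most of these checks can be reproduced on a recent NumPy. In the sketch below `x` is assumed to be `np.arange(5)`, which is consistent with the printed results but not stated in the message, and `np.cumsum` stands in for the running sum. The division examples are left out because they divide by zero.

```python
import numpy as np

mask = np.array([False, False, False, True, True])
x = np.arange(5)                   # assumed value of x

builtin_sum = sum(mask)            # Python's sum over NumPy booleans
list_sum = sum(mask.tolist())      # Python's sum over plain bools
running = np.cumsum(mask)          # running count of True values

# Mixed with integers, the integer dtype dominates.
powered = x ** mask
shifted = mask - 5

print(int(builtin_sum), list_sum, running, powered, shifted)
```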
Re: [Numpy-discussion] Deprecate boolean math operators?
For + and * (and thus `dot`), this will fix something that is not broken. It is in fact in conformance with a large literature on boolean arrays and boolean matrices. That not everyone pays attention to this literature does not constitute a reason to break the extant, correct behavior.

I'm sure I cannot be the only one who has for years taught students about Boolean matrices using NumPy, because of this correct behavior of this dtype. (By correct, I mean in conformance with the literature.)

Alan Isaac
Re: [Numpy-discussion] Deprecate boolean math operators?
On Thu, Dec 5, 2013 at 11:05 PM, Alan G Isaac alan.is...@gmail.com wrote:
> For + and * (and thus `dot`), this will fix something that is not broken.

+ and * are not broken - just redundant given | and &. What is really broken is -, both unary and binary:

    int(np.bool_(0) - np.bool_(1))
    1
    int(-np.bool_(0))
    1

> I'm sure I cannot be the only one who has for years taught students about Boolean matrices using NumPy

(I would not be so sure :-)

In that experience, did you find minus to be as useful?
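The two surprising results above are xor and logical not in disguise. The sketch below spells them explicitly. It assumes a recent NumPy, where the minus forms on boolean scalars raise `TypeError` instead of returning a value, so the old spelling is wrapped in `try`/`except`.

```python
import numpy as np

t, f = np.bool_(True), np.bool_(False)

# Old spelling: returned True in the versions discussed here, TypeError today.
try:
    old_binary = f - t
except TypeError:
    old_binary = None

# Explicit spellings of the same truth tables.
binary_xor = f ^ t     # what 0 - 1 evaluated to
unary_not = ~f         # what -0 evaluated to

print(old_binary, int(binary_xor), int(unary_not))
```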
Re: [Numpy-discussion] Deprecate boolean math operators?
On Thu, Dec 5, 2013 at 11:00 PM, Alexander Belopolsky ndar...@mac.com wrote:
> On Thu, Dec 5, 2013 at 10:35 PM, josef.p...@gmail.com wrote:
>> what about np.dot,np.dot(mask, x) which is the same as (mask * x).sum(0) ?
>
> I am not sure which way your argument goes, but I don't think you would find the following natural:
>
>     x = array([True, True])
>     dot(x,x)
>     True

this is weird but I would never do that. maybe I would, but then i would add 1 non boolean

>     (x*x).sum()
>     2
>     (x*x).sum(0)
>     2

That sounds right to me

    (mask**2 == mask).all()
    True

>     (x*x).sum(False)
>     2

What is axis=False?

The way my argument goes: I'm a heavy user of using * pretending the bool behaves like an int, and of sum and accumulate. It would be a pain to lose them.

From where I come from (*) a bool is not a boolean it's just 0, 1, given that numpy casting rules apply and it's sometimes cast back to (0, 1)

Does this work as explanation for the pattern of + and - also.

(*) places where the type system is more restricted.

What about max?

    np.maximum(mask, mask)
    array([False, False, False, True, True], dtype=bool)
    np.maximum(mask, ~mask)
    array([ True, True, True, True, True], dtype=bool)
    mask + mask
    array([False, False, False, True, True], dtype=bool)
    mask + ~mask
    array([ True, True, True, True, True], dtype=bool)

first mask is if the wife has a car, second mask is if the husband has a car. The max is if there is a car in the family. What's this as logical?

Josef
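As a logical operation, the elementwise max of two boolean arrays is the inclusive or. A minimal sketch with made-up household data:

```python
import numpy as np

wife_has_car = np.array([False, False, True, True])
husband_has_car = np.array([False, True, False, True])

family_max = np.maximum(wife_has_car, husband_has_car)
family_or = wife_has_car | husband_has_car
family_plus = wife_has_car + husband_has_car   # boolean + is also "or"

print(family_max)
```

All three spellings produce the same boolean array, which is why `+` on booleans can be read as "or".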