[Numpy-discussion] nan, sign, and all that
Hi All,

I've added ufuncs fmin and fmax that behave as follows:

In [3]: a = array([NAN, 0, NAN, 1])
In [4]: b = array([0, NAN, NAN, 0])
In [5]: fmax(a,b)
Out[5]: array([ 0., 0., NaN, 1.])
In [6]: fmin(a,b)
Out[6]: array([ 0., 0., NaN, 0.])
In [7]: fmax.reduce(a)
Out[7]: 1.0
In [8]: fmin.reduce(a)
Out[8]: 0.0
In [9]: fmax.reduce([NAN,NAN])
Out[9]: nan
In [10]: fmin.reduce([NAN,NAN])
Out[10]: nan

I also made the sign ufunc return the sign of nan. That works, but I'm not sure it is the way to go because there doesn't seem to be any spec as to what sign nan takes. The current np.nan on my machine is negative, and 0/0 and inf/inf all return negative nan. So it doesn't look like the actual sign of nan makes any sense. Currently sign(NAN) returns 0, which doesn't look right either, so I think the thing to do is return nan, but this will be a change in numpy behavior.

Any thoughts?

Chuck
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
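The session above can be reproduced as a standalone script. This is only a minimal sketch: it assumes a NumPy release that exposes the new ufuncs as np.fmax and np.fmin, and it spells NaN as np.nan. The variable names are illustrative and do not come from the post.

```python
import numpy as np

a = np.array([np.nan, 0.0, np.nan, 1.0])
b = np.array([0.0, np.nan, np.nan, 0.0])

# Elementwise: the result is NaN only where both inputs are NaN.
hi = np.fmax(a, b)   # the post shows array([ 0., 0., NaN, 1.])
lo = np.fmin(a, b)   # the post shows array([ 0., 0., NaN, 0.])

# Reductions skip NaN entries unless every entry is NaN.
red_max = np.fmax.reduce(a)
red_min = np.fmin.reduce(a)
all_nan = np.fmax.reduce(np.array([np.nan, np.nan]))

print(hi, lo, red_max, red_min, all_nan)
```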
Re: [Numpy-discussion] nan, sign, and all that
Hi Charles,

2008/10/2 Charles R Harris [EMAIL PROTECTED]:
> In [3]: a = array([NAN, 0, NAN, 1])
> In [4]: b = array([0, NAN, NAN, 0])
> In [5]: fmax(a,b)
> Out[5]: array([ 0., 0., NaN, 1.])
> In [6]: fmin(a,b)
> Out[6]: array([ 0., 0., NaN, 0.])

These are great, many thanks! My only gripe is that they have the same NaN-handling as amin and friends, which I consider to be broken. Others also mentioned that this should be changed, and I think David C wrote a patch for it (but I am not informed as to the speed implications).

If I had to choose, this would be my preferred output:

In [5]: fmax(a,b)
Out[5]: array([ NaN, NaN, NaN, 1.])

Cheers
Stéfan
Re: [Numpy-discussion] nan, sign, and all that
On Thu, Oct 2, 2008 at 02:37, Stéfan van der Walt [EMAIL PROTECTED] wrote:
> Hi Charles,
>
> 2008/10/2 Charles R Harris [EMAIL PROTECTED]:
>> In [3]: a = array([NAN, 0, NAN, 1])
>> In [4]: b = array([0, NAN, NAN, 0])
>> In [5]: fmax(a,b)
>> Out[5]: array([ 0., 0., NaN, 1.])
>> In [6]: fmin(a,b)
>> Out[6]: array([ 0., 0., NaN, 0.])
>
> These are great, many thanks! My only gripe is that they have the same NaN-handling as amin and friends, which I consider to be broken.

No, these follow well-defined C99 semantics of the fmin() and fmax() functions in libm. If exactly one of the arguments is a NaN, the non-NaN argument is returned. This is *not* the current behavior of amin() et al., which just do naive comparisons.

> Others also mentioned that this should be changed, and I think David C wrote a patch for it (but I am not informed as to the speed implications).
>
> If I had to choose, this would be my preferred output:
>
> In [5]: fmax(a,b)
> Out[5]: array([ NaN, NaN, NaN, 1.])

Chuck proposes letting minimum() and maximum() have that behavior.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
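The two conventions contrasted in this message can be put side by side. The sketch below assumes a NumPy release in which the proposal was adopted, that is, one where np.maximum propagates NaN while np.fmax follows the C99 rule. The names c99_style, propagating and disagree are made up for illustration.

```python
import numpy as np

a = np.array([np.nan, 0.0, np.nan, 1.0])
b = np.array([0.0, np.nan, np.nan, 0.0])

# C99 rule: if exactly one operand is NaN, return the other operand.
c99_style = np.fmax(a, b)

# Propagating rule: any NaN operand makes the result NaN.
# (Assumes a NumPy release where maximum() behaves this way.)
propagating = np.maximum(a, b)

# The conventions differ exactly where one, and only one, operand was NaN.
disagree = np.isnan(propagating) & ~np.isnan(c99_style)

print(c99_style, propagating, disagree)
```

Keeping both ufuncs lets the caller choose per call site whether a stray NaN should be silently dropped or should remain visible in the result.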
Re: [Numpy-discussion] nan, sign, and all that
2008/10/2 Robert Kern [EMAIL PROTECTED]:
>> My only gripe is that they have the same NaN-handling as amin and friends, which I consider to be broken.
>
> No, these follow well-defined C99 semantics of the fmin() and fmax() functions in libm. If exactly one of the arguments is a NaN, the non-NaN argument is returned. This is *not* the current behavior of amin() et al., which just do naive comparisons.

Let me rephrase: I'm not convinced that these C99 semantics provide an optimal user experience. It worries me greatly that NaN's pop up in operations and then disappear again. It is entirely possible for a script to run without failure and spew out garbage without the user ever knowing.

>> Others also mentioned that this should be changed, and I think David C wrote a patch for it (but I am not informed as to the speed implications).
>>
>> If I had to choose, this would be my preferred output:
>>
>> In [5]: fmax(a,b)
>> Out[5]: array([ NaN, NaN, NaN, 1.])
>
> Chuck proposes letting minimum() and maximum() have that behavior.

That would be a good start, which would be complemented by educating the user via some appropriate mechanism (I still don't know if one exists; there is no NumPy Paperclip (TM) that states "You have decided to commit scientific suicide. Would you like me to cut your wrists?"). That's meant only half-tongue-in-cheekedly :)

Thanks for your comments,

Cheers
Stéfan
Re: [Numpy-discussion] nan, sign, and all that
On Thu, Oct 2, 2008 at 4:37 PM, Stéfan van der Walt [EMAIL PROTECTED] wrote:
> These are great, many thanks! My only gripe is that they have the same NaN-handling as amin and friends, which I consider to be broken. Others also mentioned that this should be changed, and I think David C wrote a patch for it (but I am not informed as to the speed implications).

Hopefully, Chuck and I have synchronised a bit on this :) Before, I thought there were only two behaviors, NaN-ignoring and NaN-propagating. Robert later mentioned that fmin/fmax have a third, well-specified behavior in C99. All three are useful, and as such have been more or less implemented by Chuck or me.

I think the new C functions by Chuck make sense as a new Python API that follows C99 fmax/fmin. They could be used for the new max/min, but that feels a bit strange next to nanmax/nanmin, so I would prefer having the *current* numpy.max and numpy.min propagate NaN, and nanmax/nanmin ignore NaN altogether. Also note that Matlab does not propagate NaN for max/min.

The last question is FPU status flag handling: I thought comparing NaN directly with a relational operator (e.g. <) would raise FE_INVALID. But this is not the case (at least on Linux with glibc and on Mac OS X). This is confusing because I thought the whole point of the C99 macro isgreater was to avoid raising it. This is also how I understand both the glibc manual and the Mac OS X man page for isgreater. Robert, do you have any insight on this?

David
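The three behaviors listed in this message can be seen on a single array. This is a rough sketch that assumes a NumPy release where np.max propagates NaN, as preferred above, and where np.fmax is available. The variable names are hypothetical.

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])

# 1. NaN-propagating: one NaN anywhere makes the reduction NaN.
propagated = np.max(x)

# 2. NaN-ignoring: NaN entries are skipped.
ignored = np.nanmax(x)

# 3. C99 fmax applied pairwise: a NaN loses against any number.
c99 = np.fmax.reduce(x)

print(propagated, ignored, c99)
```

On input that mixes numbers and NaN, the second and third give the same value. The practical difference in the thread is about naming and about which one the plain max/min should default to.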
Re: [Numpy-discussion] nan, sign, and all that
Stéfan van der Walt [EMAIL PROTECTED] writes:
> Let me rephrase: I'm not convinced that these C99 semantics provide an optimal user experience. It worries me greatly that NaN's pop up in operations and then disappear again. It is entirely possible for a script to run without failure and spew out garbage without the user ever knowing.

By default NaNs are propagated through operations on them. At the end of this discussion we ought to end up with a list of functions such as fmax, isnan, and copysign that are the exceptions.

I think that it is right to defer to IEEE for their decisions on the behavior of NaNs, etc. That is what C and Fortran are doing. I have not checked but I would guess that CPUs and FPUs behave that way too. So it should be easier and faster to follow IEEE.

Note that in the just released Python 2.6 floating point support of IEEE 754 has been beefed up.

-- 
Pete Forman
WesternGeco
[EMAIL PROTECTED]
http://petef.22web.net

Disclaimer: This post is originated by myself and does not represent the opinion of Schlumberger or WesternGeco.
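The split between the default rule and its exceptions can be illustrated with the three functions named above. This is a sketch only: it uses Python's math module for isnan and copysign and assumes NumPy's np.fmax for the C99 fmax. The variable names are invented for the example.

```python
import math
import numpy as np

nan = float("nan")

# Default rule: arithmetic on a NaN yields a NaN.
propagates = [nan + 1.0, nan * 0.0, math.sqrt(nan)]

# Exceptions named in the thread: each returns an ordinary value.
is_nan = math.isnan(nan)                  # a boolean, not a NaN
neg_nan = math.copysign(nan, -1.0)        # still NaN, sign bit now set
sign_read = math.copysign(1.0, neg_nan)   # reads that sign bit back
picked = np.fmax(nan, 2.0)                # C99 fmax drops the NaN

print(propagates, is_nan, sign_read, picked)
```

copysign works on the sign bit alone, which is why it gives a definite answer even though the sign of a NaN produced by arithmetic carries no agreed meaning.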
Re: [Numpy-discussion] nan, sign, and all that
On Thu, Oct 2, 2008 at 1:42 AM, Robert Kern [EMAIL PROTECTED] wrote:
> On Thu, Oct 2, 2008 at 02:37, Stéfan van der Walt [EMAIL PROTECTED] wrote:
>> Hi Charles,
>>
>> 2008/10/2 Charles R Harris [EMAIL PROTECTED]:
>>> In [3]: a = array([NAN, 0, NAN, 1])
>>> In [4]: b = array([0, NAN, NAN, 0])
>>> In [5]: fmax(a,b)
>>> Out[5]: array([ 0., 0., NaN, 1.])
>>> In [6]: fmin(a,b)
>>> Out[6]: array([ 0., 0., NaN, 0.])
>>
>> These are great, many thanks! My only gripe is that they have the same NaN-handling as amin and friends, which I consider to be broken.
>
> No, these follow well-defined C99 semantics of the fmin() and fmax() functions in libm. If exactly one of the arguments is a NaN, the non-NaN argument is returned. This is *not* the current behavior of amin() et al., which just do naive comparisons.
>
>> Others also mentioned that this should be changed, and I think David C wrote a patch for it (but I am not informed as to the speed implications).
>>
>> If I had to choose, this would be my preferred output:
>>
>> In [5]: fmax(a,b)
>> Out[5]: array([ NaN, NaN, NaN, 1.])
>
> Chuck proposes letting minimum() and maximum() have that behavior.

Yes. If there is any agreement on this I would like to go ahead and do it. It does change the current behavior of maximum and minimum.

Chuck
Re: [Numpy-discussion] nan, sign, and all that
Charles R Harris wrote:
> Yes. If there is any agreement on this I would like to go ahead and do it. It does change the current behavior of maximum and minimum.

If you do it, please do it with as many tests as possible (it should not be difficult to have a comprehensive test with *all* float data types), because this is likely to cause problems on some platforms.

thanks,

David
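A test of the kind requested here might loop over the float dtypes and check both conventions. The sketch below is not the test that went into NumPy. The helper check_nan_handling and the FLOAT_DTYPES tuple are hypothetical, and it assumes a release where maximum/minimum propagate NaN and fmax/fmin follow C99.

```python
import numpy as np

# Hypothetical dtype list; extend it with whatever the platform provides.
FLOAT_DTYPES = (np.float32, np.float64, np.longdouble)


def check_nan_handling(dtype):
    """Assert the NaN conventions discussed in the thread for one dtype."""
    a = np.array([np.nan, 0, np.nan, 1], dtype=dtype)
    b = np.array([0, np.nan, np.nan, 0], dtype=dtype)

    # Proposed maximum()/minimum(): NaN wins wherever it appears.
    nan_mask = np.array([True, True, True, False])
    assert np.array_equal(np.isnan(np.maximum(a, b)), nan_mask)
    assert np.array_equal(np.isnan(np.minimum(a, b)), nan_mask)
    assert np.maximum(a, b)[3] == 1 and np.minimum(a, b)[3] == 0

    # fmax()/fmin(): NaN only where both inputs are NaN.
    np.testing.assert_array_equal(
        np.fmax(a, b), np.array([0, 0, np.nan, 1], dtype=dtype))
    np.testing.assert_array_equal(
        np.fmin(a, b), np.array([0, 0, np.nan, 0], dtype=dtype))

    # The result keeps the input dtype.
    assert np.fmax(a, b).dtype == np.dtype(dtype)


for dt in FLOAT_DTYPES:
    check_nan_handling(dt)

checked = [np.dtype(dt).name for dt in FLOAT_DTYPES]
print("checked:", checked)
```

Running the same assertions for every dtype is what exposes platform differences, since the width of np.longdouble in particular varies from one build to another.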
Re: [Numpy-discussion] nan, sign, and all that
On Thu, Oct 2, 2008 at 08:22, Charles R Harris [EMAIL PROTECTED] wrote:
> On Thu, Oct 2, 2008 at 1:42 AM, Robert Kern [EMAIL PROTECTED] wrote:
>> On Thu, Oct 2, 2008 at 02:37, Stéfan van der Walt [EMAIL PROTECTED] wrote:
>>> Hi Charles,
>>>
>>> 2008/10/2 Charles R Harris [EMAIL PROTECTED]:
>>>> In [3]: a = array([NAN, 0, NAN, 1])
>>>> In [4]: b = array([0, NAN, NAN, 0])
>>>> In [5]: fmax(a,b)
>>>> Out[5]: array([ 0., 0., NaN, 1.])
>>>> In [6]: fmin(a,b)
>>>> Out[6]: array([ 0., 0., NaN, 0.])
>>>
>>> These are great, many thanks! My only gripe is that they have the same NaN-handling as amin and friends, which I consider to be broken.
>>
>> No, these follow well-defined C99 semantics of the fmin() and fmax() functions in libm. If exactly one of the arguments is a NaN, the non-NaN argument is returned. This is *not* the current behavior of amin() et al., which just do naive comparisons.
>>
>>> Others also mentioned that this should be changed, and I think David C wrote a patch for it (but I am not informed as to the speed implications).
>>>
>>> If I had to choose, this would be my preferred output:
>>>
>>> In [5]: fmax(a,b)
>>> Out[5]: array([ NaN, NaN, NaN, 1.])
>>
>> Chuck proposes letting minimum() and maximum() have that behavior.
>
> Yes. If there is any agreement on this I would like to go ahead and do it. It does change the current behavior of maximum and minimum.

I think the position we've held is that in the presence of NaNs, the behavior of these functions has been left unspecified, so I think it is okay to change them.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco