[Numpy-discussion] nan, sign, and all that

2008-10-02 Thread Charles R Harris
Hi All,

I've added ufuncs fmin and fmax that behave as follows:

In [3]: a = array([NAN, 0, NAN, 1])

In [4]: b = array([0, NAN, NAN, 0])

In [5]: fmax(a,b)
Out[5]: array([  0.,   0.,  NaN,   1.])

In [6]: fmin(a,b)
Out[6]: array([  0.,   0.,  NaN,   0.])

In [7]: fmax.reduce(a)
Out[7]: 1.0

In [8]: fmin.reduce(a)
Out[8]: 0.0

In [9]: fmax.reduce([NAN,NAN])
Out[9]: nan

In [10]: fmin.reduce([NAN,NAN])
Out[10]: nan
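
For anyone who wants to rerun the session above as a script, here is a minimal sketch. It assumes a NumPy release that ships the fmax and fmin ufuncs under those names and spells the constant np.nan (the session above used a development build with NAN already in the namespace). The variable names are only illustrative.

```python
import numpy as np

a = np.array([np.nan, 0.0, np.nan, 1.0])
b = np.array([0.0, np.nan, np.nan, 0.0])

# Elementwise: the result is NaN only where both inputs are NaN.
elementwise_max = np.fmax(a, b)
elementwise_min = np.fmin(a, b)

# Reductions skip NaNs unless every element is NaN.
reduced_max = np.fmax.reduce(a)
reduced_min = np.fmin.reduce(a)
all_nan_max = np.fmax.reduce(np.array([np.nan, np.nan]))

print(elementwise_max, elementwise_min)
print(reduced_max, reduced_min, all_nan_max)
```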

I also made the sign ufunc return the sign of nan. That works, but I'm not
sure it is the way to go because there doesn't seem to be any spec as to
what sign nan takes. The current np.nan on my machine is negative, and 0/0
and inf/inf both return negative nan, so the actual sign of nan doesn't seem
to carry any meaning. Currently sign(NAN) returns 0, which doesn't look right
either, so I think the thing to do is return nan, but this will be a change
in numpy behavior. Any thoughts?
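
A small sketch of how the sign bit of a NaN can be inspected without relying on whatever NaN a given platform produces. It assumes a NumPy release that provides np.signbit and np.copysign; the names neg_nan and pos_nan are made up for the example. The value sign() returns for a NaN is only printed, since that is exactly the behavior under discussion.

```python
import numpy as np

# copysign forces the sign bit, so the result does not depend on which
# NaN the platform happens to generate for 0/0 or inf/inf.
neg_nan = np.copysign(np.nan, -1.0)
pos_nan = np.copysign(np.nan, 1.0)

# signbit reads the sign bit without doing arithmetic on the value.
print(np.signbit(neg_nan), np.signbit(pos_nan))

# Both values are still NaN regardless of the sign bit.
print(np.isnan(neg_nan), np.isnan(pos_nan))

# What sign() returns for a NaN depends on the NumPy version; it is
# printed here rather than assumed.
print(np.sign(neg_nan), np.sign(pos_nan))
```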

Chuck
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] nan, sign, and all that

2008-10-02 Thread Stéfan van der Walt
Hi Charles,

2008/10/2 Charles R Harris [EMAIL PROTECTED]:
> In [3]: a = array([NAN, 0, NAN, 1])
> In [4]: b = array([0, NAN, NAN, 0])
>
> In [5]: fmax(a,b)
> Out[5]: array([  0.,   0.,  NaN,   1.])
>
> In [6]: fmin(a,b)
> Out[6]: array([  0.,   0.,  NaN,   0.])

These are great, many thanks!

My only gripe is that they have the same NaN-handling as amin and
friends, which I consider to be broken.  Others also mentioned that
this should be changed, and I think David C wrote a patch for it (but
I am not informed as to the speed implications).

If I had to choose, this would be my preferred output:

In [5]: fmax(a,b)
Out[5]: array([  NaN,   NaN,  NaN,   1.])

Cheers
Stéfan


Re: [Numpy-discussion] nan, sign, and all that

2008-10-02 Thread Robert Kern
On Thu, Oct 2, 2008 at 02:37, Stéfan van der Walt [EMAIL PROTECTED] wrote:
> Hi Charles,
>
> 2008/10/2 Charles R Harris [EMAIL PROTECTED]:
>> In [3]: a = array([NAN, 0, NAN, 1])
>> In [4]: b = array([0, NAN, NAN, 0])
>>
>> In [5]: fmax(a,b)
>> Out[5]: array([  0.,   0.,  NaN,   1.])
>>
>> In [6]: fmin(a,b)
>> Out[6]: array([  0.,   0.,  NaN,   0.])
>
> These are great, many thanks!
>
> My only gripe is that they have the same NaN-handling as amin and
> friends, which I consider to be broken.

No, these follow well-defined C99 semantics of the fmin() and fmax()
functions in libm. If exactly one of the arguments is a NaN, the
non-NaN argument is returned. This is *not* the current behavior of
amin() et al., which just do naive comparisons.
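
To make the difference concrete, here is a pure-Python sketch. naive_max is a stand-in for a plain comparison and is not NumPy's actual amin/amax code; c99_style_fmax follows the rule stated above (if exactly one argument is NaN, return the other). Both function names are hypothetical.

```python
import math


def naive_max(x, y):
    # Plain comparison: every comparison involving NaN is False, so the
    # result depends on which argument holds the NaN.
    return x if x > y else y


def c99_style_fmax(x, y):
    # If exactly one argument is NaN, return the other one.
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return x if x > y else y


nan = float("nan")
print(naive_max(nan, 0.0), naive_max(0.0, nan))            # order-dependent
print(c99_style_fmax(nan, 0.0), c99_style_fmax(0.0, nan))  # symmetric
```

The naive version returns the second argument whenever a NaN is involved, which is why its result changes when the arguments are swapped.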

> Others also mentioned that
> this should be changed, and I think David C wrote a patch for it (but
> I am not informed as to the speed implications).
>
> If I had to choose, this would be my preferred output:
>
> In [5]: fmax(a,b)
> Out[5]: array([  NaN,   NaN,  NaN,   1.])

Chuck proposes letting minimum() and maximum() have that behavior.
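
As a sketch of the two behaviors side by side, assuming a NumPy release in which maximum() propagates NaN as proposed here (which is how current releases document it) and fmax() follows the C99 rule:

```python
import numpy as np

a = np.array([np.nan, 0.0, np.nan, 1.0])
b = np.array([0.0, np.nan, np.nan, 0.0])

propagating = np.maximum(a, b)  # NaN wherever either input is NaN
ignoring = np.fmax(a, b)        # NaN only where both inputs are NaN

print(propagating)
print(ignoring)
```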

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco


Re: [Numpy-discussion] nan, sign, and all that

2008-10-02 Thread Stéfan van der Walt
2008/10/2 Robert Kern [EMAIL PROTECTED]:
>> My only gripe is that they have the same NaN-handling as amin and
>> friends, which I consider to be broken.
>
> No, these follow well-defined C99 semantics of the fmin() and fmax()
> functions in libm. If exactly one of the arguments is a NaN, the
> non-NaN argument is returned. This is *not* the current behavior of
> amin() et al., which just do naive comparisons.

Let me rephrase: I'm not convinced that these C99 semantics provide an
optimal user experience.  It worries me greatly that NaN's pop up in
operations and then disappear again.  It is entirely possible for a
script to run without failure and spew out garbage without the user
ever knowing.
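
One way a script can guard against that today is to check for NaNs explicitly before reducing. The helper below is a hypothetical sketch, not an existing NumPy function; it only assumes np.isnan and np.fmax.reduce.

```python
import numpy as np


def fmax_reduce_checked(values):
    # Hypothetical helper: refuse to reduce if any NaN is present, so a
    # NaN cannot silently drop out of the result.
    arr = np.asarray(values, dtype=float)
    if np.isnan(arr).any():
        raise ValueError("input contains NaN")
    return np.fmax.reduce(arr)


print(fmax_reduce_checked([0.0, 2.0, 1.0]))
try:
    fmax_reduce_checked([0.0, np.nan, 1.0])
except ValueError as exc:
    print("rejected:", exc)
```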

>> Others also mentioned that
>> this should be changed, and I think David C wrote a patch for it (but
>> I am not informed as to the speed implications).
>>
>> If I had to choose, this would be my preferred output:
>>
>> In [5]: fmax(a,b)
>> Out[5]: array([  NaN,   NaN,  NaN,   1.])
>
> Chuck proposes letting minimum() and maximum() have that behavior.

That would be a good start, which would be complemented by educating
the user via some appropriate mechanism (I still don't know if one
exists; there is no NumPy Paperclip (TM) that states "You have decided
to commit scientific suicide.  Would you like me to cut your
wrists?").  That's meant only half-tongue-in-cheekedly :)

Thanks for your comments,

Cheers
Stéfan


Re: [Numpy-discussion] nan, sign, and all that

2008-10-02 Thread David Cournapeau
On Thu, Oct 2, 2008 at 4:37 PM, Stéfan van der Walt [EMAIL PROTECTED] wrote:

> These are great, many thanks!
>
> My only gripe is that they have the same NaN-handling as amin and
> friends, which I consider to be broken.  Others also mentioned that
> this should be changed, and I think David C wrote a patch for it (but
> I am not informed as to the speed implications).

Hopefully, Chuck and I have synchronised a bit on this :) Originally I
thought there were two behaviors, NaN-ignoring and NaN-propagating.
Robert later mentioned that fmin/fmax has a third, well-specified
behavior in C99. All three are useful, and as such have been more or
less implemented by Chuck or me.

I think exposing Chuck's new C functions as a new Python API makes
sense, to follow C99 fmax/fmin. They could be used for the new
max/min, but that feels a bit strange compared to nanmax/nanmin, so I
would prefer having the *current* numpy.max and numpy.min propagate
NaN, and nanmax/nanmin ignore NaN altogether.
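
A sketch of the three reductions on the same data, assuming a current NumPy release in which np.max propagates NaN, np.nanmax skips NaNs, and np.fmax.reduce applies the C99 rule pairwise:

```python
import numpy as np

a = np.array([np.nan, 0.0, np.nan, 1.0])

propagated = np.max(a)        # NaN propagates
ignored = np.nanmax(a)        # NaNs are skipped
c99_like = np.fmax.reduce(a)  # pairwise C99 fmax semantics

print(propagated, ignored, c99_like)
```

On this input nanmax and fmax.reduce agree. As far as I know they differ mainly on all-NaN input, where nanmax also emits a RuntimeWarning while fmax.reduce quietly returns nan.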

Also note that MATLAB does not propagate NaN for max/min.

The last question is FPU status flag handling: I thought comparing NaN
directly with an ordinary relational operator would raise FE_INVALID.
But this is not the case (at least on Linux with glibc and Mac OS X).
This is confusing because I thought the whole point of the C99 macro
isgreater was to avoid raising it. This is also how I understand both
the glibc manual and the Mac OS X man page for isgreater. Robert, do
you have any insight on this?

David


Re: [Numpy-discussion] nan, sign, and all that

2008-10-02 Thread Pete Forman
Stéfan van der Walt [EMAIL PROTECTED] writes:

> Let me rephrase: I'm not convinced that these C99 semantics provide
> an optimal user experience.  It worries me greatly that NaN's pop
> up in operations and then disappear again.  It is entirely possible
> for a script to run without failure and spew out garbage without
> the user ever knowing.

By default NaNs are propagated through operations on them.  At the end
of this discussion we ought to end up with a list of functions such as
fmax, isnan, and copysign that are the exceptions.
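
A short sketch of the default propagation next to the three exceptions just named, using NumPy scalars. The variable names are illustrative only, and the list of exceptions is not meant to be complete.

```python
import numpy as np

x = np.nan

# Ordinary arithmetic and most math functions propagate NaN.
propagated = [x + 1.0, x * 0.0, np.sqrt(x)]

# The exceptions do not hand back a propagated NaN.
is_nan = np.isnan(x)            # a boolean, not NaN
fmax_result = np.fmax(x, 1.0)   # the non-NaN argument

# copysign only reads the sign bit of its second argument, even if that
# argument is a NaN.
neg_nan = np.copysign(np.nan, -1.0)
sign_from_nan = np.copysign(1.0, neg_nan)

print(propagated, is_nan, fmax_result, sign_from_nan)
```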

I think that it is right to defer to IEEE for their decisions on the
behavior of NaNs, etc.  That is what C and Fortran are doing.  I have
not checked but I would guess that CPUs and FPUs behave that way too.
So it should be easier and faster to follow IEEE.

Note that in the just released Python 2.6 floating point support of
IEEE 754 has been beefed up.
-- 
Pete Forman-./\.-  Disclaimer: This post is originated
WesternGeco  -./\.-   by myself and does not represent
[EMAIL PROTECTED]-./\.-   the opinion of Schlumberger or
http://petef.22web.net   -./\.-   WesternGeco.



Re: [Numpy-discussion] nan, sign, and all that

2008-10-02 Thread Charles R Harris
On Thu, Oct 2, 2008 at 1:42 AM, Robert Kern [EMAIL PROTECTED] wrote:

> Chuck proposes letting minimum() and maximum() have that behavior.


Yes. If there is any agreement on this I would like to go ahead and do it.
It does change the current behavior of maximum and minimum.

Chuck


Re: [Numpy-discussion] nan, sign, and all that

2008-10-02 Thread David Cournapeau
Charles R Harris wrote:

> Yes. If there is any agreement on this I would like to go ahead and do
> it. It does change the current behavior of maximum and minimum.

If you do it, please do it with as many tests as possible (it should not
be difficult to have a comprehensive test with *all* float data types),
because this is likely to cause problems on some platforms.
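
As a rough idea of what such a test could look like, here is a sketch that loops over a few real floating-point dtypes. It assumes a NumPy release where maximum/minimum propagate NaN as proposed; FLOAT_TYPES and check_nan_handling are hypothetical names, and half precision and complex types are left out on purpose to keep it short.

```python
import numpy as np

# A subset of the float types; extend as needed for a given platform.
FLOAT_TYPES = (np.float32, np.float64, np.longdouble)


def check_nan_handling(dtype):
    nan = np.nan
    a = np.array([nan, 0, nan, 1], dtype=dtype)
    b = np.array([0, nan, nan, 0], dtype=dtype)

    # assert_array_equal treats NaNs in matching positions as equal.
    np.testing.assert_array_equal(np.maximum(a, b), [nan, nan, nan, 1])
    np.testing.assert_array_equal(np.minimum(a, b), [nan, nan, nan, 0])
    np.testing.assert_array_equal(np.fmax(a, b), [0, 0, nan, 1])
    np.testing.assert_array_equal(np.fmin(a, b), [0, 0, nan, 0])

    # The result should keep the input dtype.
    assert np.maximum(a, b).dtype == np.dtype(dtype)


for dtype in FLOAT_TYPES:
    check_nan_handling(dtype)
print("checked", len(FLOAT_TYPES), "float types")
```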

thanks,

David


Re: [Numpy-discussion] nan, sign, and all that

2008-10-02 Thread Robert Kern
On Thu, Oct 2, 2008 at 08:22, Charles R Harris
[EMAIL PROTECTED] wrote:

> On Thu, Oct 2, 2008 at 1:42 AM, Robert Kern [EMAIL PROTECTED] wrote:
>> Chuck proposes letting minimum() and maximum() have that behavior.
>
> Yes. If there is any agreement on this I would like to go ahead and do it.
> It does change the current behavior of maximum and minimum.

I think the position we've held is that in the presence of NaNs, the
behavior of these functions has been left unspecified, so I think it
is okay to change them.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco