[Numpy-discussion] x.min() depends on ordering
x.min() and x.max() depend on the ordering of the elements:

>>> x = M.matrix([[M.nan, 2.0, 1.0]])
>>> x.min()
nan
>>> x = M.matrix([[1.0, 2.0, M.nan]])
>>> x.min()
1.0

If I were to try the latter in ipython, I'd assume, great, min() ignores NaNs. But then the former would be a bug in my program.

Is this related to how sort works?

>>> x = M.matrix([[M.nan, 2.0, 1.0]])
>>> x.sort()
>>> x
matrix([[ nan, 1., 2.]])
>>> x = M.matrix([[1.0, 2.0, M.nan]])
>>> x.sort()
>>> x
matrix([[ 1., 2., nan]])

___
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion
Re: [Numpy-discussion] x.min() depends on ordering
Keith Goodman wrote:
> x.min() and x.max() depend on the ordering of the elements:
>
> >>> x = M.matrix([[M.nan, 2.0, 1.0]])
> >>> x.min()
> nan
> >>> x = M.matrix([[1.0, 2.0, M.nan]])
> >>> x.min()
> 1.0
>
> If I were to try the latter in ipython, I'd assume, great, min() ignores NaNs. But then the former would be a bug in my program.
>
> Is this related to how sort works?

Not really. sort() is a more complicated algorithm that does a number of different comparisons in an order that is difficult to determine beforehand. x.min() should just be a straight pass through all of the elements. However, the core problem is the same: a < nan, a > nan, a == nan are all False for any a.

Barring a clever solution (at least cleverer than I feel like being immediately), the way to solve this would be to check for nans in the array and deal with them separately (or simply ignore them in the case of x.min()). However, this checking would slow down the common case that has no nans (sans nans, if you will).

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
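The comparison rule described above can be reproduced without NumPy. The following is only a minimal sketch: `naive_min` is a hypothetical helper that seeds the result with the first element and uses a plain `<` test, and it is not claimed to be how NumPy implements min().

```python
import math


def naive_min(values):
    """Straight pass that seeds the result with the first element.

    Hypothetical helper for illustration; not NumPy's implementation.
    """
    result = values[0]
    for v in values[1:]:
        if v < result:  # always False when either operand is NaN
            result = v
    return result


nan = float("nan")

# Every ordered comparison and the equality test against NaN is False.
print(1.0 < nan, 1.0 > nan, 1.0 == nan)

leading = naive_min([nan, 2.0, 1.0])   # NaN seeds the result and is never displaced
trailing = naive_min([1.0, 2.0, nan])  # NaN never wins a comparison
print(math.isnan(leading), trailing)
```

With a NaN in the first position nothing ever compares smaller than the seed, while a NaN later in the sequence can never replace the current minimum, which is one way the order dependence in the original report could arise.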
Re: [Numpy-discussion] x.min() depends on ordering
On 11/11/06, Robert Kern [EMAIL PROTECTED] wrote:
> Barring a clever solution (at least cleverer than I feel like being immediately), the way to solve this would be to check for nans in the array and deal with them separately (or simply ignore them in the case of x.min()). However, this checking would slow down the common case that has no nans (sans nans, if you will).

I'm not one of the fans of sans nans. I'd prefer a slower min() that ignored nans. But I'm probably in the minority.

How about a nanmin() function?
Re: [Numpy-discussion] x.min() depends on ordering
On 11/11/06, Robert Kern [EMAIL PROTECTED] wrote:
> Keith Goodman wrote:
>> x.min() and x.max() depend on the ordering of the elements:
>>
>> >>> x = M.matrix([[M.nan, 2.0, 1.0]])
>> >>> x.min()
>> nan
>> >>> x = M.matrix([[1.0, 2.0, M.nan]])
>> >>> x.min()
>> 1.0
>>
>> If I were to try the latter in ipython, I'd assume, great, min() ignores NaNs. But then the former would be a bug in my program.
>>
>> Is this related to how sort works?
>
> Not really. sort() is a more complicated algorithm that does a number of different comparisons in an order that is difficult to determine beforehand. x.min() should just be a straight pass through all of the elements. However, the core problem is the same: a < nan, a > nan, a == nan are all False for any a.
>
> Barring a clever solution (at least cleverer than I feel like being immediately), the way to solve this would be to check for nans in the array and deal with them separately (or simply ignore them in the case of x.min()). However, this checking would slow down the common case that has no nans (sans nans, if you will).

I think the problem is that the max and min functions use the first value in the array as the starting point. That could be fixed by using the first non-nan and returning nan if there aren't any real numbers. But it probably isn't worth the effort as the behavior becomes more complicated. A better rule of thumb is to note that comparisons involving nans are basically invalid because nans aren't comparable -- the comparison violates trichotomy. Don't really know what to do about that.

Chuck
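The fix suggested above (start from the first non-nan, return nan if there are no real numbers) could look roughly like this. It is a pure-Python sketch under that reading of the proposal; the function name `min_from_first_non_nan` is made up for illustration and nothing here is taken from NumPy's source.

```python
import math


def min_from_first_non_nan(values):
    """Seed with the first non-NaN value, then do the usual straight pass.

    Returns NaN when the input holds no real numbers (hypothetical helper).
    """
    it = iter(values)
    for start in it:
        if not math.isnan(start):
            break
    else:
        # Loop finished without finding a real number (or input was empty).
        return float("nan")

    result = start
    for v in it:
        if v < result:  # later NaNs never satisfy this, so they are ignored
            result = v
    return result


nan = float("nan")
print(min_from_first_non_nan([nan, 2.0, 1.0]))
print(min_from_first_non_nan([1.0, 2.0, nan]))
print(min_from_first_non_nan([nan, nan]))
```

Both orderings now give the same answer, at the cost of the extra is-NaN scan at the front of the array.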
Re: [Numpy-discussion] x.min() depends on ordering
Keith Goodman wrote:
> How about a nanmin() function?

Already there.

In [2]: nanmin?
Type:           function
Base Class:     <type 'function'>
Namespace:      Interactive
File:           /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy-1.0.1.dev3432-py2.5-macosx-10.4-i386.egg/numpy/lib/function_base.py
Definition:     nanmin(a, axis=None)
Docstring:
    Find the minimium over the given axis, ignoring NaNs.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
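For reference, a short usage sketch of nanmin() on the two arrays from the original report. It uses plain ndarrays rather than the matrix class from the first message, which is an assumption made to keep the example small; nanmax() is shown alongside since x.max() was mentioned too.

```python
import numpy as np

x = np.array([[np.nan, 2.0, 1.0]])
y = np.array([[1.0, 2.0, np.nan]])

# NaNs are ignored regardless of where they sit in the array.
print(np.nanmin(x), np.nanmin(y))
print(np.nanmax(x), np.nanmax(y))

# The axis argument from the signature above works as for min().
row_min = np.nanmin(x, axis=1)
print(row_min)
```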
Re: [Numpy-discussion] x.min() depends on ordering
Robert Kern wrote:
> Keith Goodman wrote:
>> x.min() and x.max() depend on the ordering of the elements:
>>
>> >>> x = M.matrix([[M.nan, 2.0, 1.0]])
>> >>> x.min()
>> nan
>> >>> x = M.matrix([[1.0, 2.0, M.nan]])
>> >>> x.min()
>> 1.0
>>
>> If I were to try the latter in ipython, I'd assume, great, min() ignores NaNs. But then the former would be a bug in my program.
>>
>> Is this related to how sort works?
>
> Not really. sort() is a more complicated algorithm that does a number of different comparisons in an order that is difficult to determine beforehand. x.min() should just be a straight pass through all of the elements. However, the core problem is the same: a < nan, a > nan, a == nan are all False for any a.
>
> Barring a clever solution (at least cleverer than I feel like being immediately), the way to solve this would be to check for nans in the array and deal with them separately (or simply ignore them in the case of x.min()). However, this checking would slow down the common case that has no nans (sans nans, if you will).

For ignoring NaNs, isn't it simply a matter of scanning through the array till you find the first non-NaN, then proceeding as normal? In the common case, this requires one extra compare (or rather is_nan), which should be negligible in most circumstances. Only when you have an array with a load of NaNs at the beginning would it be slow. One would have to decide whether to return NaN or raise an error when there were no real numbers.

My preference would be to raise an error / warning when there is a nan in the array. Technically, there is no minimum value when a nan is present. I believe that this would be feasible by swapping the compare from 'a < b' to '!(a >= b)'. This should return NaN if any NaNs are present, and I suspect the extra '!' will have minimal performance impact, but it would have to be tested. Then a warning or error could be issued on the way out depending on the error state. Arguably returning NaN is more correct than returning the smallest non-NaN anyway.

As for Keith Goodman's request for a NaN-ignoring min function, I suggest:

    a[~np.isnan(a)].min()

Or better yet, stop generating so many NaNs.

-tim
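The masking one-liner suggested above can be tried as follows. This is a small sketch with made-up sample arrays; the point it adds is that the expression has to decide what to do when every element is NaN, because the masked array is then empty.

```python
import numpy as np

a = np.array([np.nan, 2.0, 1.0])

# Keep only the entries that are not NaN, then take the ordinary minimum.
masked_min = a[~np.isnan(a)].min()
print(masked_min)

# If nothing survives the mask, min() of the empty array raises ValueError.
all_nan = np.array([np.nan, np.nan])
try:
    all_nan[~np.isnan(all_nan)].min()
    raised = False
except ValueError as exc:
    raised = True
    print("no real numbers left:", exc)
```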
Re: [Numpy-discussion] x.min() depends on ordering
On 11/11/06, Robert Kern [EMAIL PROTECTED] wrote:
> Keith Goodman wrote:
>> How about a nanmin() function?
>
> Already there.
>
> In [2]: nanmin?
> Type:           function
> Base Class:     <type 'function'>
> Namespace:      Interactive
> File:           /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy-1.0.1.dev3432-py2.5-macosx-10.4-i386.egg/numpy/lib/function_base.py
> Definition:     nanmin(a, axis=None)
> Docstring:
>     Find the minimium over the given axis, ignoring NaNs.

Thank you! I was just about to write my own using Chuck's email as a guide.
Re: [Numpy-discussion] x.min() depends on ordering
On 11/11/06, Charles R Harris [EMAIL PROTECTED] wrote:
> I think the problem is that the max and min functions use the first value in the array as the starting point. That could be fixed by using the first non-nan and returning nan if there aren't any real numbers. But it probably isn't worth the effort as the behavior becomes more complicated. A better rule of thumb is to note that comparisons involving nans are basically invalid because nans aren't comparable -- the comparison violates trichotomy. Don't really know what to do about that.

Well, we could get simple consistent behaviour by taking inf as the initial value for min and -inf as the initial value for max, then reducing as normal. This would then, depending on how max and min are implemented, either return NaN if any are present, or return the smallest/largest non-NaN value (or inf/-inf if there are none).

A. M. Archibald
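A sketch of the inf-seeded reduction described above, covering only one of the two outcomes mentioned: the variant where the update test is `v < acc`, which ends up ignoring NaNs. The helper name is hypothetical, and how a real implementation would behave depends on the comparison it actually uses.

```python
import math
from functools import reduce


def min_seeded_with_inf(values):
    """Reduce with +inf as the initial value (hypothetical helper).

    The accumulator is replaced only when `v < acc` is True, so a NaN can
    never enter it and the result does not depend on element order.
    """
    return reduce(lambda acc, v: v if v < acc else acc, values, math.inf)


nan = float("nan")
print(min_seeded_with_inf([nan, 2.0, 1.0]))
print(min_seeded_with_inf([1.0, 2.0, nan]))
print(min_seeded_with_inf([nan, nan]))  # no real numbers: the seed comes back
```

The matching max would seed with `-math.inf` and test `v > acc`.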
Re: [Numpy-discussion] x.min() depends on ordering
On 11/11/06, Tim Hochberg [EMAIL PROTECTED] wrote:
> Robert Kern wrote:
> <snip>
>
> My preference would be to raise an error / warning when there is a nan in the array. Technically, there is no minimum value when a nan is present. I believe that this would be feasible by swapping the compare from 'a < b' to '!(a >= b)'. This should return NaN if any NaNs are present, and I suspect the extra '!' will have minimal performance impact, but it would have to be tested. Then a warning or error could be issued on the way out depending on the error state. Arguably returning NaN is more correct than returning the smallest non-NaN anyway.

No telling what compiler optimizations might do with '!(a >= b)' if they assume that '!(a >= b)' == 'a < b'. For instance,

    if !(a >= b)
        do something;
    else
        do otherwise;

might branch to the second statement on 'a < b' and fall through to the first otherwise.

Chuck
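To make the concern concrete, here is a small Python sketch showing that the two forms of the test are interchangeable only for ordered operands. It assumes the pair under discussion is `a < b` versus `!(a >= b)`, and it says nothing about what any particular C compiler does under optimization.

```python
nan = float("nan")


def lt(a, b):
    """The plain comparison."""
    return a < b


def not_ge(a, b):
    """The negated complement, which looks equivalent under trichotomy."""
    return not (a >= b)


pairs = [(1.0, 2.0), (2.0, 1.0), (1.0, 1.0), (nan, 1.0), (1.0, nan)]
for a, b in pairs:
    # The two columns agree on every pair except those containing NaN.
    print(a, b, lt(a, b), not_ge(a, b))
```

An optimizer that rewrites one form into the other is therefore changing the result exactly in the NaN cases that the proposal relies on.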
Re: [Numpy-discussion] x.min() depends on ordering
On 11/11/06, Charles R Harris [EMAIL PROTECTED] wrote:
> On 11/11/06, Tim Hochberg [EMAIL PROTECTED] wrote:
>> Robert Kern wrote:
>> <snip>
>>
>> My preference would be to raise an error / warning when there is a nan in the array. Technically, there is no minimum value when a nan is present. I believe that this would be feasible by swapping the compare from 'a < b' to '!(a >= b)'. This should return NaN if any NaNs are present, and I suspect the extra '!' will have minimal performance impact, but it would have to be tested. Then a warning or error could be issued on the way out depending on the error state. Arguably returning NaN is more correct than returning the smallest non-NaN anyway.
>
> No telling what compiler optimizations might do with '!(a >= b)' if they assume that '!(a >= b)' == 'a < b'. For instance,
>
>     if !(a >= b)
>         do something;
>     else
>         do otherwise;

This made me curious. Here is the code:

    int test(double a, double b)
    {
        if (a > b)
            return 0;
        return 1;
    }

Here is the relevant part of the assembly code when compiled with no optimizations:

            fucompp
            fnstsw  %ax
            sahf
            ja      .L4
            jmp     .L2
    .L4:
            movl    $0, -20(%ebp)
            jmp     .L5
    .L2:
            movl    $1, -20(%ebp)
    .L5:
            movl    -20(%ebp), %eax
            leave
            ret

Which jumps to the right place on a > b (ja).

Here is the relevant part of the assembly code when compiled with -O3:

            fucompp
            fnstsw  %ax
            popl    %ebp
            sahf
            setbe   %al
            movzbl  %al, %eax
            ret

Which sets the value of the return to the logical value of a <= b (setbe), which won't work right with NaNs. Maybe the compiler needs another flag to deal with the possibility of NaNs because the generated code is actually incorrect. Or maybe I just discovered a compiler bug. But boy, that compiler is awesome clever. Those optimizers are getting better all the time.

Chuck
Re: [Numpy-discussion] x.min() depends on ordering
Robert Kern wrote:
> Keith Goodman wrote:
>> How about a nanmin() function?
>
> Already there.
>
> In [2]: nanmin?
> Type:           function
> Base Class:     <type 'function'>
> Namespace:      Interactive
> File:           /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy-1.0.1.dev3432-py2.5-macosx-10.4-i386.egg/numpy/lib/function_base.py
> Definition:     nanmin(a, axis=None)
> Docstring:
>     Find the minimium over the given axis, ignoring NaNs.

Bah! That seems excessive. If I were king of the world, one of the many things that I would stuff into a submodule is all the stuff relating to special values (nan, inf, isnan, isinf, nanmin, nanmax, nansum, nanargmax, nanargmin, nan_to_num, infty, isneginf, isposinf and probably some others that I'm missing).

First, these are mostly clutter in the base namespace. Moving them would make the main namespace easier to navigate (although there's much clean up that would have to be done before it would be anything approaching easy to navigate). Second, for those who actually do need them, they'd be easier to find if they were all grouped together -- Keith, for example, would almost certainly have immediately found nanmin. Third, and this is perhaps a matter of opinion, there seems to be a sudden urge to abuse NaNs. Perhaps if they were shunted a bit off to the side, this temptation would be lifted.

Curmudgeonly yours,

-tim
Re: [Numpy-discussion] x.min() depends on ordering
On 11/11/06, Tim Hochberg [EMAIL PROTECTED] wrote:
> Robert Kern wrote:
>> Keith Goodman wrote:
>>> How about a nanmin() function?
>>
>> Already there.
>>
>> In [2]: nanmin?
>> Type:           function
>> Base Class:     <type 'function'>
>> Namespace:      Interactive
>> File:           /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy-1.0.1.dev3432-py2.5-macosx-10.4-i386.egg/numpy/lib/function_base.py
>> Definition:     nanmin(a, axis=None)
>> Docstring:
>>     Find the minimium over the given axis, ignoring NaNs.
>
> Bah! That seems excessive. If I were king of the world, one of the many things that I would stuff into a submodule is all the stuff relating to special values (nan, inf, isnan, isinf, nanmin, nanmax, nansum, nanargmax, nanargmin, nan_to_num, infty, isneginf, isposinf and probably some others that I'm missing).
>
> First, these are mostly clutter in the base namespace. Moving them would make the main namespace easier to navigate (although there's much clean up that would have to be done before it would be anything approaching easy to navigate). Second, for those who actually do need them, they'd be easier to find if they were all grouped together -- Keith, for example, would almost certainly have immediately found nanmin. Third, and this is perhaps a matter of opinion, there seems to be a sudden urge to abuse NaNs. Perhaps if they were shunted a bit off to the side, this temptation would be lifted.
>
> Curmudgeonly yours,

Oh yes. And let's reserve a bit of abuse for whoever mixed up the nans with the rest. I mean, the infs and such actually make some sense as numbers, but nans are, well, nans. So it would have been nice to enable everything *except* nans, and have an error flag set whenever the latter turned up. Or if folks just have to have nans, make them compare less than anything else. If isnan were fast and easy one could use LT(a,b) := isnan(a) || a < b.

Chuck
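The LT(a,b) definition above translates directly into Python. The sketch below plugs it into a first-element-seeded minimum to show the effect; both function names are invented for the illustration, and this is one possible reading of "make them compare less than anything else", not an existing NumPy behaviour.

```python
import math


def lt_nan_smallest(a, b):
    """LT(a, b) := isnan(a) || a < b, so NaN sorts below every number."""
    return math.isnan(a) or a < b


def min_with_lt(values):
    """Straight pass seeded with the first element, using the LT test above."""
    result = values[0]
    for v in values[1:]:
        if lt_nan_smallest(v, result):
            result = v
    return result


nan = float("nan")
first = min_with_lt([nan, 2.0, 1.0])
last = min_with_lt([1.0, 2.0, nan])
print(math.isnan(first), math.isnan(last))
print(min_with_lt([3.0, 1.0, 2.0]))
```

With this ordering the minimum is NaN whenever a NaN is present, independent of its position, since a NaN always displaces a real number and is never displaced by one.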
Re: [Numpy-discussion] x.min() depends on ordering
Charles R Harris wrote:
> On 11/11/06, Tim Hochberg [EMAIL PROTECTED] wrote:
>> Robert Kern wrote:
>>> Keith Goodman wrote:
>>>> How about a nanmin() function?
>>>
>>> Already there.
>>>
>>> In [2]: nanmin?
>>> Type:           function
>>> Base Class:     <type 'function'>
>>> Namespace:      Interactive
>>> File:           /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy-1.0.1.dev3432-py2.5-macosx-10.4-i386.egg/numpy/lib/function_base.py
>>> Definition:     nanmin(a, axis=None)
>>> Docstring:
>>>     Find the minimium over the given axis, ignoring NaNs.
>>
>> Bah! That seems excessive. If I were king of the world, one of the many things that I would stuff into a submodule is all the stuff relating to special values (nan, inf, isnan, isinf, nanmin, nanmax, nansum, nanargmax, nanargmin, nan_to_num, infty, isneginf, isposinf and probably some others that I'm missing).
>>
>> First, these are mostly clutter in the base namespace. Moving them would make the main namespace easier to navigate (although there's much clean up that would have to be done before it would be anything approaching easy to navigate). Second, for those who actually do need them, they'd be easier to find if they were all grouped together -- Keith, for example, would almost certainly have immediately found nanmin. Third, and this is perhaps a matter of opinion, there seems to be a sudden urge to abuse NaNs. Perhaps if they were shunted a bit off to the side, this temptation would be lifted.
>>
>> Curmudgeonly yours,
>
> Oh yes. And let's reserve a bit of abuse for whoever mixed up the nans with the rest. I mean, the infs and such actually make some sense as numbers, but nans are, well, nans. So it would have been nice to enable everything *except* nans, and have an error flag set whenever the latter turned up.

Actually you can do this. At least for the most part; I'm sure there are some corners that aren't working right yet. Operations on NaNs set the invalid flag while infinities set the overflow flag (actually the flags are only set when you first get the infinity or NaN, at least on my platform). So, you can make invalid operations raise an error and ignore overflow by using:

    np.seterr(over='ignore', invalid='raise')

I doubt this does *exactly* what you want, but it may be close.

-tim

[snip]
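A short sketch of the seterr call above in use. The sample arrays are made up, and the previous settings are restored at the end so the example does not leave global state changed; as the message says, which operations set which flag may vary in the corners, so this shows only two common cases.

```python
import numpy as np

# seterr returns the previous settings so they can be restored afterwards.
old_settings = np.seterr(over='ignore', invalid='raise')
try:
    # Overflow produces inf and, with over='ignore', passes silently.
    overflowed = np.array([1e308]) * 10.0
    print(overflowed)

    # inf - inf has no meaningful value: it sets the invalid flag, which
    # now raises instead of quietly producing a NaN.
    inf = np.array([np.inf])
    try:
        inf - inf
        raised = False
    except FloatingPointError as exc:
        raised = True
        print("invalid operation:", exc)
finally:
    np.seterr(**old_settings)
```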