Charles Doutriaux wrote:
> Hi Stephane,
>
> This is a good suggestion, I'm ccing the numpy list on this. Because I'm
> wondering if it wouldn't be a better fit to do it directly at the
> numpy.ma level.
>
> I'm sure they already thought about this (and 'inf' values as well) and
> if they don't do it , there's probably some good reason we didn't think
> of yet.
> So before i go ahead and do it in MV2 I'd like to know the reason why
> it's not in numpy.ma, they are probably valid for MVs too.
>
> C.
>
> Stephane Raynaud wrote:
>
>> Hi,
>>
>> how about automatically (or at least optionally) masking all NaN
>> values when creating a MV array?
>>
>> On Thu, Jul 24, 2008 at 11:43 PM, Arthur M. Greene
>> <[EMAIL PROTECTED]> wrote:
>>
>>     Yup, this works. Thanks!
>>
>>     I guess it's time for me to dig deeper into numpy syntax and
>>     functions, now that CDAT is using the numpy core for array
>>     management...
>>
>>     Best,
>>
>>     Arthur
>>
>>
>>     Charles Doutriaux wrote:
>>
>>         Seems right to me,
>>
>>         Except that the syntax might scare a bit the new users :)
>>
>>         C.
>>
>>         [EMAIL PROTECTED] wrote:
>>
>>             Hi,
>>
>>             I'm not sure if what I am about to suggest is a good idea
>>             or not, perhaps Charles will correct me if this is a bad
>>             idea for any reason.
>>
>>             Lets say you have a cdms variable called U with NaNs as
>>             the missing value. First we can replace the NaNs with 1e20:
>>
>>             U.data[numpy.where(numpy.isnan(U.data))] = 1e20
>>
>>             And remember to set the missing value of the variable
>>             appropriately:
>>
>>             U.setMissing(1e20)
>>
>>             I hope that helps, Andrew
>>
>>
>>                 Hi Arthur,
>>
>>                 If i remember correctly the way i used to do it was:
>>                 a= MV2.greater(data,1.)
>>                 b=MV2.less_equal(data,1)
>>                 c=MV2.logical_and(a,b) # Nan are the only one left
>>                 data=MV2.masked_where(c,data)
>>
>>                 BUT I believe numpy now has way to deal with nan I
>>                 believe it is numpy.nan_to_num But it replaces with 0
>>                 so it may not be what you want
>>
>>                 C.
>>
>>
>>                 Arthur M. Greene wrote:
>>
>>                     A typical netcdf file is opened, and the single
>>                     variable extracted:
>>
>>                     fpr=cdms.open('prTS2p1_SEA_allmos.cdf')
>>                     pr0=fpr('prcp')
>>                     type(pr0)
>>
>>                     <class 'cdms2.tvariable.TransientVariable'>
>>
>>                     Masked values (indicating ocean in this case) show
>>                     up here as NaNs.
>>
>>                     pr0[0,-15:-5,0]
>>
>>                     prcp array([NaN NaN NaN NaN NaN NaN 0.37745094
>>                     0.3460784 0.21960783 0.19117641])
>>
>>                     So far this is all consistent. A map of the first
>>                     time step shows the proper land-ocean boundaries,
>>                     reasonable-looking values, and so on. But there
>>                     doesn't seem to be any way to mask this array, so,
>>                     e.g., an 'xy' average can be computed (it comes out
>>                     all nans). NaN is not equal to anything -- even
>>                     itself -- so there does not seem to be any
>>                     condition, among the MV.masked_xxx options, that
>>                     can be applied as a test. Also, it does not seem
>>                     possible to compute seasonal averages, anomalies,
>>                     etc. -- they also produce just NaNs.
>>
>>                     The workaround I've come up with -- for now -- is
>>                     to first generate a new array of identical shape,
>>                     filled with 1.0E+20. One test I've found that can
>>                     detect NaNs is numpy.isnan:
>>
>>                     isnan(pr0[0,0,0])
>>
>>                     True
>>
>>                     So it is _possible_ to tediously loop through
>>                     every value in the old array, testing with isnan,
>>                     then copying to the new array if the test fails.
>>                     Then the axes have to be reset...
>>
>>                     isnan does not accept array arguments, so one
>>                     cannot do, e.g.,
>>
>>                     prmasked=MV.masked_where(isnan(pr0),pr0)
>>
>>                     The element-by-element conversion is quite slow.
>>                     (I'm still waiting for it to complete, in fact).
>>                     Any suggestions for dealing with NaN-infested data
>>                     objects?
>>
>>                     Thanks!
>>
>>                     AMG
>>
>>                     P.S. This is 5.0.0.beta, RHEL4.
>>
>>                     Arthur M. Greene, Ph.D.
>>                     The International Research Institute for Climate and Society
>>                     The Earth Institute, Columbia University, Lamont Campus
>>                     Monell Building, 61 Route 9W, Palisades, NY 10964-8000 USA
>>                     amg*at*iri-dot-columbia\dot\edu | http://iri.columbia.edu
>>
>>     _______________________________________________
>>     Cdat-discussion mailing list
>>     [EMAIL PROTECTED]
>>     https://lists.sourceforge.net/lists/listinfo/cdat-discussion
>>
>>
>> --
>> Stephane Raynaud
>
> _______________________________________________
> Numpy-discussion mailing list
> [email protected]
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

Please look at the various NumPy functions that ignore NaN, such as nansum(). See the NumPy example list (http://www.scipy.org/Numpy_Example_List_With_Doc) for examples, listed either under nan or under the individual functions.
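As a minimal sketch of what "functions that ignore NaN" means in practice, the snippet below compares a plain reduction with its NaN-aware counterparts on a small made-up array. The array values are illustrative only, and numpy.nanmean is an assumption about the installed version: it was added in NumPy 1.8, which is later than the NumPy releases current when this thread was written.

```python
import numpy as np

# Small illustrative array with one missing value encoded as NaN.
x = np.array([2.0, np.nan, 1.0])

# A plain reduction propagates NaN through the whole result.
plain_sum = np.sum(x)

# NaN-aware reductions skip the NaN entries instead.
nan_sum = np.nansum(x)    # sum of the two valid values
nan_max = np.nanmax(x)    # largest valid value
nan_min = np.nanmin(x)    # smallest valid value

# Assumes NumPy >= 1.8, where nanmean is available.
nan_mean = np.nanmean(x)

print(plain_sum, nan_sum, nan_max, nan_min, nan_mean)
```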
To get the mean you can do something like:

    import numpy
    x = numpy.array([2, numpy.nan, 1])
    numpy.nansum(x)/(x.shape[0]-numpy.isnan(x).sum())

    x_masked = numpy.ma.masked_where(numpy.isnan(x), x)
    x_masked.mean()

The real advantage of masked arrays is that you have greater control over the filtering, so you can also filter extreme values:

    y = numpy.array([2, numpy.nan, 1, 1000])
    y_masked = numpy.ma.masked_where(numpy.isnan(y), y)
    y_masked = numpy.ma.masked_where(y_masked > 100, y_masked)
    y_masked.mean()

Regards
Bruce
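For the original question in this thread (masking NaN values so that averages work), one possible approach is sketched below with plain numpy.ma rather than MV2 or cdms2, which are not used here. The helper name mask_nans, the 1e20 default fill value (borrowed from the value mentioned earlier in the thread) and the sample data are all assumptions made for illustration. It relies on numpy.ma.masked_invalid, which masks both NaN and inf, and on numpy.isnan-style tests working elementwise on whole arrays, so no element-by-element loop is needed.

```python
import numpy as np

def mask_nans(values, fill_value=1e20):
    """Hypothetical helper: return a masked array with NaN/inf masked.

    fill_value is the number substituted for masked entries when the
    array is later converted back to a plain array with .filled().
    """
    masked = np.ma.masked_invalid(values)  # masks NaN and +/-inf
    masked.fill_value = fill_value
    return masked

# Illustrative 2-D field: NaN stands in for "ocean" points.
pr = np.array([[np.nan, np.nan, 0.5],
               [np.nan, 1.5,    2.5]])

pr_masked = mask_nans(pr)

# Reductions now ignore the masked points.
overall_mean = pr_masked.mean()
column_means = pr_masked.mean(axis=0)  # first column is fully masked

# Plain array with masked points replaced by the fill value.
filled = pr_masked.filled()

print(overall_mean)
print(column_means)
print(filled)
```

A column that contains only masked points stays masked in the result instead of turning into NaN, which is the behaviour the seasonal-average use case above would need.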
