I mean not having to do it myself.

If data is a numpy array with NaN in it, then

    masked_data = numpy.ma.array(data)

would return a masked array with a mask where the NaNs were in data.

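A minimal sketch of the behaviour asked about here. The helper name masked_from_nan is hypothetical and only illustrates what an automatic version of numpy.ma.array(data) could look like. It assumes a NumPy version that provides numpy.ma.masked_invalid, which masks inf as well as NaN:

```python
import numpy


def masked_from_nan(data):
    """Hypothetical helper: mask every NaN in data (inf is left unmasked)."""
    data = numpy.asarray(data, dtype=float)
    return numpy.ma.array(data, mask=numpy.isnan(data))


data = numpy.array([2.0, numpy.nan, 1.0, numpy.inf])

nan_only = masked_from_nan(data)         # masks the NaN only
invalid = numpy.ma.masked_invalid(data)  # masks the NaN and the inf
```

Whether inf should be masked along with NaN is a design choice; the two variants differ only in that respect.
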
C.

Bruce Southey wrote:
> Charles Doutriaux wrote:
>> Hi Bruce,
>>
>> Thx for the reply, we're aware of this, basically the question was why
>> not mask NaN automatically when creating a numpy.ma array?
>>
>> C.
>>
>> Bruce Southey wrote:
>>> Charles Doutriaux wrote:
>>>> Hi Stephane,
>>>>
>>>> This is a good suggestion, I'm ccing the numpy list on this. Because I'm
>>>> wondering if it wouldn't be a better fit to do it directly at the
>>>> numpy.ma level.
>>>>
>>>> I'm sure they already thought about this (and 'inf' values as well) and
>>>> if they don't do it, there's probably some good reason we didn't think
>>>> of yet.
>>>> So before I go ahead and do it in MV2 I'd like to know the reasons why
>>>> it's not in numpy.ma, they are probably valid for MVs too.
>>>>
>>>> C.
>>>>
>>>> Stephane Raynaud wrote:
>>>>> Hi,
>>>>>
>>>>> how about automatically (or at least optionally) masking all NaN
>>>>> values when creating a MV array?
>>>>>
>>>>> On Thu, Jul 24, 2008 at 11:43 PM, Arthur M. Greene
>>>>> <[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>> wrote:
>>>>>
>>>>> Yup, this works. Thanks!
>>>>>
>>>>> I guess it's time for me to dig deeper into numpy syntax and
>>>>> functions, now that CDAT is using the numpy core for array
>>>>> management...
>>>>>
>>>>> Best,
>>>>>
>>>>> Arthur
>>>>>
>>>>> Charles Doutriaux wrote:
>>>>>
>>>>> Seems right to me,
>>>>>
>>>>> Except that the syntax might scare new users a bit :)
>>>>>
>>>>> C.
>>>>>
>>>>> [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I'm not sure if what I am about to suggest is a good idea
>>>>> or not, perhaps Charles will correct me if this is a bad
>>>>> idea for any reason.
>>>>>
>>>>> Let's say you have a cdms variable called U with NaNs as
>>>>> the missing value.
>>>>> First we can replace the NaNs with 1e20:
>>>>>
>>>>>     U.data[numpy.where(numpy.isnan(U.data))] = 1e20
>>>>>
>>>>> And remember to set the missing value of the variable
>>>>> appropriately:
>>>>>
>>>>>     U.setMissing(1e20)
>>>>>
>>>>> I hope that helps, Andrew
>>>>>
>>>>> Hi Arthur,
>>>>>
>>>>> If I remember correctly the way I used to do it was:
>>>>>
>>>>>     a = MV2.greater(data, 1.)
>>>>>     b = MV2.less_equal(data, 1.)
>>>>>     # NaN fails both comparisons, so it is the only value for which
>>>>>     # neither a nor b is True
>>>>>     c = MV2.logical_not(MV2.logical_or(a, b))
>>>>>     data = MV2.masked_where(c, data)
>>>>>
>>>>> BUT I believe numpy now has a way to deal with NaN, I
>>>>> believe it is numpy.nan_to_num. But it replaces NaN with 0,
>>>>> so it may not be what you want.
>>>>>
>>>>> C.
>>>>>
>>>>> Arthur M. Greene wrote:
>>>>>
>>>>> A typical netcdf file is opened, and the single
>>>>> variable extracted:
>>>>>
>>>>>     fpr=cdms.open('prTS2p1_SEA_allmos.cdf')
>>>>>     pr0=fpr('prcp')
>>>>>     type(pr0)
>>>>>
>>>>>     <class 'cdms2.tvariable.TransientVariable'>
>>>>>
>>>>> Masked values (indicating ocean in this case) show
>>>>> up here as NaNs.
>>>>>
>>>>>     pr0[0,-15:-5,0]
>>>>>
>>>>>     prcp array([NaN NaN NaN NaN NaN NaN 0.37745094
>>>>>     0.3460784 0.21960783 0.19117641])
>>>>>
>>>>> So far this is all consistent. A map of the first
>>>>> time step shows the proper land-ocean boundaries,
>>>>> reasonable-looking values, and so on. But there
>>>>> doesn't seem to be any way to mask this array, so,
>>>>> e.g., an 'xy' average can be computed (it comes out
>>>>> all nans). NaN is not equal to anything -- even
>>>>> itself -- so there does not seem to be any condition,
>>>>> among the MV.masked_xxx options, that can be applied
>>>>> as a test. Also, it does not seem possible to compute
>>>>> seasonal averages, anomalies, etc. -- they also
>>>>> produce just NaNs.
>>>>>
>>>>> The workaround I've come up with -- for now -- is
>>>>> to first generate a new array of identical shape,
>>>>> filled with 1.0E+20.
>>>>> One test I've found that can
>>>>> detect NaNs is numpy.isnan:
>>>>>
>>>>>     isnan(pr0[0,0,0])
>>>>>
>>>>>     True
>>>>>
>>>>> So it is _possible_ to tediously loop through
>>>>> every value in the old array, testing with isnan,
>>>>> then copying to the new array if the test fails.
>>>>> Then the axes have to be reset...
>>>>>
>>>>> isnan does not accept array arguments, so one
>>>>> cannot do, e.g.,
>>>>>
>>>>>     prmasked=MV.masked_where(isnan(pr0),pr0)
>>>>>
>>>>> The element-by-element conversion is quite slow.
>>>>> (I'm still waiting for it to complete, in fact).
>>>>> Any suggestions for dealing with NaN-infested data
>>>>> objects?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> AMG
>>>>>
>>>>> P.S. This is 5.0.0.beta, RHEL4.
>>>>>
>>>>> *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
>>>>> Arthur M. Greene, Ph.D.
>>>>> The International Research Institute for Climate and Society
>>>>> The Earth Institute, Columbia University, Lamont Campus
>>>>> Monell Building, 61 Route 9W, Palisades, NY 10964-8000 USA
>>>>> amg*at*iri-dot-columbia\dot\edu | http://iri.columbia.edu
>>>>> *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
>>>>>
>>>>> _______________________________________________
>>>>> Cdat-discussion mailing list
>>>>> [EMAIL PROTECTED]
>>>>> <mailto:[EMAIL PROTECTED]>
>>>>> https://lists.sourceforge.net/lists/listinfo/cdat-discussion
>>>>>
>>>>> --
>>>>> Stephane Raynaud
>>>>
>>>> _______________________________________________
>>>> Numpy-discussion mailing list
>>>> [email protected]
>>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>>>>
>>> Please look at the various NumPy functions that ignore NaN, like nansum(). See
>>> the NumPy example list
>>> (http://www.scipy.org/Numpy_Example_List_With_Doc) for examples under
>>> nan or individual functions.
>>>
>>> To get the mean you can do something like:
>>>
>>>     import numpy
>>>     x = numpy.array([2, numpy.nan, 1])
>>>     numpy.nansum(x)/(x.shape[0]-numpy.isnan(x).sum())
>>>     x_masked = numpy.ma.masked_where(numpy.isnan(x), x)
>>>     x_masked.mean()
>>>
>>> The real advantage of masked arrays is that you have greater control
>>> over the filtering so you can also filter extreme values:
>>>
>>>     y = numpy.array([2, numpy.nan, 1, 1000])
>>>     y_masked = numpy.ma.masked_where(numpy.isnan(y), y)
>>>     y_masked = numpy.ma.masked_where(y_masked > 100, y_masked)
>>>     y_masked.mean()
>>>
>>> Regards
>>> Bruce
>>
> You mean like doing:
>
>     import numpy
>     # y must exist before numpy.isnan(y) can be evaluated
>     y = numpy.array([2., numpy.nan, 1., 1000.])
>     y = numpy.ma.MaskedArray(y, numpy.isnan(y))
>
> ?
>
> Bruce

_______________________________________________
Numpy-discussion mailing list
[email protected]
http://projects.scipy.org/mailman/listinfo/numpy-discussion
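
For reference, the array-level approaches quoted in this thread can be put side by side in a short sketch. It relies on numpy.isnan operating element-wise on whole arrays, so no element-by-element loop is needed. The sample values and the threshold of 100 follow the quoted example, and 1e20 is the fill value suggested earlier in the thread. Using numpy.ma.masked_values for the fill-value variant is an assumption of this sketch, not something proposed by the posters:

```python
import numpy

x = numpy.array([2.0, numpy.nan, 1.0, 1000.0])

# numpy.isnan works element-wise on the whole array.
nan_mask = numpy.isnan(x)

# Mask the NaNs first, then also mask extreme values (threshold is arbitrary).
x_masked = numpy.ma.masked_where(nan_mask, x)
x_masked = numpy.ma.masked_where(x_masked > 100, x_masked)
mean_masked = x_masked.mean()

# Alternative: replace NaN with a fill value and mask on that value instead.
filled = numpy.where(nan_mask, 1e20, x)
x_from_fill = numpy.ma.masked_values(filled, 1e20)
```

The masked-array route keeps the original data untouched, while the fill-value route changes the stored numbers and depends on the fill value never occurring as real data.
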
