Re: [Numpy-discussion] bug with with fill_values in masked arrays?
On Tuesday, 25 March 2008 15:33:58, Chris Withers wrote:
>> Because in your particular case, you're inspecting elements one by one, and then, your masked data becomes the masked singleton which is a special value.
>
> I'd argue that the masked singleton having a different fill value to the ma it comes from is a bug.

Note that there's no ma it comes from. It's a singleton. A special value.

And your suggestion with isinstance would surely be less efficient than the current solution, since using the "is" operator for identity checking is as efficient as it gets.

Just ignore the fill_value, which is only there for technical reasons; it's unused in any case.

Thanks to this discussion, I finally got an impression of ma.

--
Ciao, Hans
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
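The identity check discussed above can be sketched as follows. This is a minimal illustration assuming a current NumPy under Python 3 (the thread predates both); the array contents are made up and the variable names are not from the thread.

```python
import numpy as np

# Illustrative data, not taken from the thread.
marr = np.ma.array([1.0, 2.0, 3.0, 4.0], mask=[0, 0, 0, 1], fill_value=55)

# Indexing a masked element returns the module-level singleton,
# so an identity test is enough to detect it.
last_is_masked = marr[3] is np.ma.masked
first_is_masked = marr[0] is np.ma.masked

# For whole arrays, reading the mask avoids a Python-level loop.
masked_count = int(np.ma.getmaskarray(marr).sum())

print(last_is_masked, first_is_masked, masked_count)
```

Because every masked element maps to the same object, the singleton cannot remember which array it was read from, which is the point being argued in this message.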
Re: [Numpy-discussion] bug with with fill_values in masked arrays?
Matt Knox wrote:
> data = [1., 2., 3., np.nan, 5., 6.]
> mask = [0, 0, 0, 1, 0, 0]

I'm creating the ma with ma.masked_where...

> marr = ma.array(data, mask=mask)
> marr.set_fill_value(55)
> print marr[0] is ma.masked # False
> print marr[3] # ma.masked constant

Yeah, and this is where I have the problem. The masked constant has a fill value of 9, rather than 55. That is annoying.

> filled_arr = marr.filled()
> print filled_arr # nan value is replaced with fill value of 55

Right, and this is how I currently work around the problem.

cheers,

Chris

--
Simplistix - Content Management, Zope Python Consulting - http://www.simplistix.co.uk
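For readers on a current NumPy and Python 3, the workaround quoted in this message might look roughly like the sketch below. Passing fill_value to the constructor instead of calling set_fill_value afterwards is a choice made for this sketch, not something the thread prescribes.

```python
import numpy as np

data = [1.0, 2.0, 3.0, np.nan, 5.0, 6.0]
mask = [0, 0, 0, 1, 0, 0]

# fill_value is given at construction time here; that is an
# assumption of this sketch rather than the thread's exact code.
marr = np.ma.array(data, mask=mask, fill_value=55)

# filled() returns a plain ndarray in which masked slots are
# replaced by the array's own fill_value.
filled_arr = marr.filled()
print(filled_arr)
```

Unlike element-by-element access, filled() works on the array as a whole, so the array's own fill value is available to it.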
Re: [Numpy-discussion] bug with with fill_values in masked arrays?
Pierre GM wrote:
> My bad, I neglected an overall doc for the functions and their docstring. But you know what ? As you're now at an intermediary level,

That's pretty unkind to your userbase. I know a lot about python, but I'm a total novice with numpy and even the maths it's based on.

> help: just write down the problems you encountered, and the solutions you came up with, so that we could use your experience as the backbone for a proper MaskedArray documentation

Blind leading the blind seems like a terrible idea to me...

> Try that:
> x = numpy.ma.array([0,1,2,3,])
> x[-1] = numpy.nan
> print x
> [0 1 2 0]
> See? No NaNs with an int array.

Right. Array types and whatever a dtype is are things that could be much better documented too :-(

> Well, no problem, they should stick around. Note that if a NaN/Inf should normally show up as the result of some operation (divide by zero for example), it probably won't:
> x = numpy.ma.array([0,1,2,numpy.nan],dtype=float)
> print 1./x
> [-- 1.0 0.5 nan]

NaN/inf is still NaN in my books, so why would I be surprised by this?

>> I'd argue that the masked singleton having a different fill value to the ma it comes from is a bug.
>
> It's not a bug, it's a feature(TM)

One which sucks and is unintuitive.

> The fill_value for the mask singleton is meaningless, correct. However, having numpy.ma.masked as a constant is really helpful to test whether a particular value is masked, or to mask a particular value:
> x = numpy.ma.array([0,1,2,3])
> x[-1] = masked
> x[-1] is masked
> True

I may not know much about maths, but I know about these funny things in python we have called classes to solve exactly this problem ;-)

x[-1] = Masked(fill_value=50)
isinstance(x[-1],Masked)
True

...which gives you what you want without forcing me to experience the resultant suck.

cheers,

Chris
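The divide-by-zero behaviour quoted in this message can be checked with a small sketch like the one below, assuming a current NumPy. It deliberately leaves NaN out of the input, since how a NaN in the data is reported may differ between NumPy versions.

```python
import numpy as np

values = [0.0, 1.0, 2.0]

# Plain ndarray: dividing by zero yields inf (warning silenced here).
with np.errstate(divide="ignore"):
    plain = 1.0 / np.array(values)

# MaskedArray: the zero divisor falls outside the division domain,
# so that slot comes back masked instead of holding inf.
masked_result = 1.0 / np.ma.array(values)
result_mask = np.ma.getmaskarray(masked_result).tolist()

print(plain, masked_result)
```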
Re: [Numpy-discussion] bug with with fill_values in masked arrays?
On Wednesday 26 March 2008 15:42:41 Chris Withers wrote:
> Pierre GM wrote:
>> My bad, I neglected an overall doc for the functions and their docstring. But you know what ? As you're now at an intermediary level,
>
> That's pretty unkind to your userbase. I know a lot about python, but I'm a total novice with numpy and even the maths it's based on.

My bosses have different priorities and keep reminding me that spending time writing Python code is not what I was hired to do, and that I should be writing scientific papers by the dozen. Let's say that I'm just playing middle ground to the best of my capacities. And time.

>> help: just write down the problems you encountered, and the solutions you came up with, so that we could use your experience as the backbone for a proper MaskedArray documentation
>
> Blind leading the blind seems like a terrible idea to me...

You're no longer a complete neophyte, so you're not that blind, but you are still experiencing the tough part of the learning curve. I take things for granted nowadays (for example, dtypes) that are not obvious to absolute beginners. That's exactly where you can play your role: remind me what it is to be blind so that I can help you more, and start some simple doc pages on the wiki that the community can edit/append.

> NaN/inf is still NaN in my books, so why would I be surprised by this?

Because with a regular ndarray with no NaNs initially, you could end up with NaNs and Infs with some operations. With MaskedArray, you don't.

>>> I'd argue that the masked singleton having a different fill value to the ma it comes from is a bug.
>>
>> It's not a bug, it's a feature(TM)
>
> One which sucks and is unintuitive.

I can understand the unintuitive part to a certain extent. I won't comment on the first aspect however: you know, tastes, colors, snails, oysters, that kind of thing. On top of that, I could kick into touch and say that it's needed for backwards compatibility.

> x[-1] = Masked(fill_value=50)
> isinstance(x[-1],Masked)
> True
>
> ...which gives you what you want without forcing me to experience the resultant suck.

Yeah, that's a possibility. Feel free to implement it so that we can compare the two approaches. I still don't understand why you really need to have a particular fill_value for the masked constant anyway: what are you trying to do exactly ?
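As a rough illustration of the class-based alternative under discussion, the sketch below wraps element access in a helper. Both MaskedValue and get_element are hypothetical names invented here; nothing like them exists in numpy.ma, and this is not how either participant implemented anything.

```python
import numpy as np


class MaskedValue:
    """Hypothetical per-array masked marker; not part of numpy.ma."""

    def __init__(self, fill_value):
        self.fill_value = fill_value


def get_element(marr, index):
    """Hypothetical helper: like marr[index], but a masked element is
    returned as a MaskedValue carrying marr.fill_value."""
    value = marr[index]
    if value is np.ma.masked:
        return MaskedValue(marr.fill_value)
    return value


marr = np.ma.array([0.0, 1.0, 2.0, 3.0], mask=[0, 0, 0, 1], fill_value=50)
last = get_element(marr, -1)
print(isinstance(last, MaskedValue), last.fill_value)
```

The trade-off is the one raised earlier in the thread: an isinstance test plus an object allocation per masked element, against a single identity comparison with the shared singleton.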
Re: [Numpy-discussion] bug with with fill_values in masked arrays?
Pierre GM wrote:
> Well, yeah, my bad, that depends on whether you use masked_invalid or fix_invalid or just build a basic masked array.

Yeah, well, if there were any docs I'd have a *clue* what you were talking about ;-)

> y=ma.fix_invalid(x)

I've never done this ;-)

> Having NaNs in an array usually reduces performance: the option we follow w/ fix_invalid is to clear the masked array of the NaNs, while keeping track of where they were by setting the mask to True at the appropriate location.

That's good to know

> That way, you don't have the drop of performance of having NaNs in your underlying array. Oh, and NaNs will be transformed to 0 if you use ints...

use ints in what context?

> Nope, the idea really is to make things as efficient as possible.

For you, maybe. And for me, yes, except I wanted the NaNs to stick around...

> y=ma.masked_invalid(x)

I'm not using masked_invalid. I didn't even know it existed.

>>> Because in your particular case, you're inspecting elements one by one, and then, your masked data becomes the masked singleton which is a special value.
>>
>> I'd argue that the masked singleton having a different fill value to the ma it comes from is a bug.
>
> And once again, it's not. numpy.ma.masked is a special value, like numpy.nan or numpy.inf

...which is silly, since that forces it to have a fixed fill value, which it should not.

cheers,

Chris
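The difference between the two functions named in this message can be sketched as below. The behaviour shown is what recent NumPy releases do; whether the 2008 code behaved identically is not something this sketch claims.

```python
import numpy as np

x = np.array([1.0, np.nan, 2.0])

# fix_invalid masks the NaN and overwrites it in the underlying data.
fixed = np.ma.fix_invalid(x, fill_value=0.0)

# masked_invalid masks the NaN but leaves the stored data alone.
kept = np.ma.masked_invalid(x)

print(fixed.data, kept.data)
```

Someone who wants the NaNs to stick around in .data, as asked for above, would therefore reach for masked_invalid (or masked_where) rather than fix_invalid.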
Re: [Numpy-discussion] bug with with fill_values in masked arrays?
Pierre GM wrote:
>> This sucks to the point of feeling like a bug :-(
>
> It is not.

Ignoring the fill value of masked array feels like a bug to me...

>> Why is it desirable for it to behave like this?
>
> Because that way, you can compare anything to masked and see whether a value is masked or not. Anyway, in your case, it just means your value is masked. You don't care about the filling_value for this one.

Where I cared was when trying to do a filled line plot in matplotlib and the nans, rather than being omitted, were being shown on the y-axis at 99, totally wrecking the plot.

I'll buy your argument *iff* the masked arrays used the fill value from the parent ma.

cheers,

Chris
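One way to keep a stray fill value such as 99 out of plotted data is to convert the whole array at once, with an explicit NaN for masked slots, before handing it to the plotting code. The sketch below assumes a current NumPy and leaves the plotting call itself out; matplotlib line plots typically leave a gap at NaN points, but whether that suits a filled plot is something to verify.

```python
import numpy as np

a = np.array([1.0, np.nan, 2.0])
aa = np.ma.masked_where(np.isnan(a), a)

# Replace masked slots with NaN explicitly, ignoring any fill_value,
# so that no placeholder number ends up in the plotted series.
y = aa.filled(np.nan)
print(y)
```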
Re: [Numpy-discussion] bug with with fill_values in masked arrays?
Pierre GM wrote:
> On Wednesday 19 March 2008 19:47:37 Matt Knox wrote:
>>> 1. why am I not getting my NaN's back?
>
> Because they're gone when you create your masked array.

Really? At least one other post has disagreed with that. And it does seem odd that a value, even if it's a nan, would be destroyed...

> The idea here is to get rid of the nan in your data

No, it's to mask them, otherwise I would have used a normal array, not a ma.

> to avoid potential problems while keeping track of where the nans were in the first place.

...like plotting them on a graph, which the current behaviour makes unworkable, so that you end up doing a myarray.filled(0) to get around it, with imperfect results.

> So, the .data part of your masked array should be nan-free,

Why? Surely that should be the source data, of which nan is a valid part?

> and the mask tells you where the nans were.

Right, but why, when the masked array is cast back to a list of numbers, is the fill_value of the ma not respected?

>>> 2. why is the wrong fill value being used here?
>>
>> the second element in the array iteration here is actually the numpy.ma.masked constant, which always has the same fill value...

...and that's a bug.

cheers,

Chris
Re: [Numpy-discussion] bug with with fill_values in masked arrays?
On Friday 21 March 2008 12:52:45 Chris Withers wrote:
> Pierre GM wrote:
>>> This sucks to the point of feeling like a bug :-(
>>
>> It is not.
>
> Ignoring the fill value of masked array feels like a bug to me...

You're right with masked arrays, but here we're talking about the masked singleton, a special value.

> Where I cared was when trying to do a filled line plot in matplotlib and the nans, rather than being omitted, were being shown on the y-axis at 99, totally wrecking the plot.

You're losing me there. Send a simple example/script so that I can have a better idea of what you're trying to do.

> I'll buy your argument *iff* the masked arrays used the fill value from the parent ma.

What parent ma ?
Re: [Numpy-discussion] bug with with fill_values in masked arrays?
Matt Knox wrote:
>> 1. why am I not getting my NaN's back?
>
> when iterating over a masked array, you get the ma.masked constant for elements that were masked (same as what you would get if you indexed the masked array at that element). If you are referring specifically to the .data portion of the array... it looks like the latest version of the numpy.ma sub-module preserves nan's in the data portion of the masked array, but the old version perhaps doesn't based on the output you are showing.

OK, when's this going to make it into a release?

>> 2. why is the wrong fill value being used here?
>
> the second element in the array iteration here is actually the numpy.ma.masked constant, which always has the same fill value (which I guess is 99).

This sucks to the point of feeling like a bug :-(
Why is it desirable for it to behave like this?

cheers,

Chris
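The two points made in this message, that iteration yields the masked constant and that .data keeps the NaN, can be checked with a short sketch. It assumes a current NumPy, which corresponds to the "latest version" behaviour described here rather than the older output shown in the original post.

```python
import numpy as np

a = np.array([1.0, np.nan, 2.0])
aa = np.ma.masked_where(np.isnan(a), a)

# Iterating yields the masked singleton for masked slots.
flags = [v is np.ma.masked for v in aa]

# The underlying buffer is still reachable through .data.
raw = aa.data
print(flags, raw)
```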
[Numpy-discussion] bug with with fill_values in masked arrays?
OK, my specific problem with masked arrays is as follows:

>>> a = numpy.array([1,numpy.nan,2])
>>> aa = numpy.ma.masked_where(numpy.isnan(a),a)
>>> aa
array(data = [ 1.e+00 1.e+20 2.e+00], mask = [False True False], fill_value=1e+020)
>>> numpy.ma.set_fill_value(aa,0)
>>> aa
array(data = [ 1. 0. 2.], mask = [False True False], fill_value=0)

OK, so this looks like I want it to, however:

>>> [v for v in aa]
[1.0, array(data = 99, mask = True, fill_value=99) , 2.0]

Two questions:

1. why am I not getting my NaN's back?

2. why is the wrong fill value being used here?

cheers,

Chris
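For comparison with the list comprehension above, here is a minimal sketch of two whole-array conversions that do not go through the masked constant. It assumes a current NumPy and Python 3; the variable names as_numbers and as_optional are made up for this example.

```python
import numpy as np

a = np.array([1.0, np.nan, 2.0])
aa = np.ma.masked_where(np.isnan(a), a)
np.ma.set_fill_value(aa, 0)

# filled() honours the array's own fill_value.
as_numbers = aa.filled().tolist()

# tolist() without arguments marks masked slots with None instead.
as_optional = aa.tolist()
print(as_numbers, as_optional)
```

Both calls operate on the array rather than on individual elements, so the fill value set on aa is the one that gets used.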