On Fri, Nov 4, 2011 at 1:22 PM, T J <tjhn...@gmail.com> wrote:
> I agree that it would be ideal if the default were to skip IGNORED values,
> but that behavior seems inconsistent with its propagation properties (such
> as when adding arrays with IGNORED values). To illustrate, when we did
> "x+2", we were stating that:
>
>   IGNORED(2) + 2 == IGNORED(4)
>
> which means that we propagated the IGNORED value. If we were to skip them
> by default, then we'd have:
>
>   IGNORED(2) + 2 == 2
>
> To be consistent, then it seems we also should have had:
>
>   >>> x + 2
>   [3, 2, 5]
>
> which I think we can agree is not so desirable. What this seems to come
> down to is that we tend to want different behavior when we are doing
> reductions, and that for IGNORED data, we want it to propagate in every
> situation except for a reduction (where we want to skip over it).
>
> I don't know if there is a well-defined way to distinguish reductions from
> the other operations. Would it hold for generalized ufuncs? Would it hold
> for other functions which might return arrays instead of scalars?

Continuing my theme of looking for consensus first... there are obviously a
ton of ugly corners in here. But my impression is that at least for some
simple cases, it's clear what users want:

  >>> a = [1, IGNORED(2), 3]
  # array-with-ignored-values + unignored scalar only affects unignored values
  >>> a + 2
  [3, IGNORED(2), 5]
  # reduction operations skip ignored values
  >>> np.sum(a)
  4

For example, Gary mentioned the common idiom of wanting to take an array and
subtract off its mean, and he wants to do that while leaving the
masked-out/ignored values unchanged. As long as the above cases work the way
I wrote, we will have

  >>> np.mean(a)
  2
  >>> a -= np.mean(a)
  >>> a
  [-1, IGNORED(2), 1]

Which I'm pretty sure is the result that he wants. (Gary, is that right?)
Also numpy.ma follows these rules, so that's some additional evidence that
they're reasonable. (And I think part of the confusion between Lluís and me
was that these are the rules that I meant when I said "non-propagating", but
he understood that to mean something else.)

So before we start exploring the whole vast space of possible ways to handle
masked-out data, does anyone see any reason to consider rules that don't
have, as a subset, the ones above? Do other rules have any use cases or user
demand? (I *love* playing with clever mathematics and making things
consistent, but there's not much point unless the end result is something
that people will use :-).)

-- Nathaniel
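
To make the two rules above concrete, here is a toy pure-Python model of them. It is only a sketch: the `Ignored` wrapper and the helpers `add_scalar`, `total` and `mean` are hypothetical names invented for illustration, not NumPy API and not a proposed implementation. Elementwise operations leave ignored entries untouched, and reductions skip them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ignored:
    """Hypothetical marker: wraps a payload that is currently ignored."""
    payload: float


def add_scalar(values, scalar):
    """Elementwise rule: only unignored entries are affected."""
    return [v if isinstance(v, Ignored) else v + scalar for v in values]


def total(values):
    """Reduction rule: ignored entries are skipped."""
    return sum(v for v in values if not isinstance(v, Ignored))


def mean(values):
    """Mean over the unignored entries only."""
    kept = [v for v in values if not isinstance(v, Ignored)]
    return sum(kept) / len(kept)


a = [1, Ignored(2), 3]

shifted = add_scalar(a, 2)          # ignored entry keeps its payload
s = total(a)                        # sums 1 and 3 only
centered = add_scalar(a, -mean(a))  # the "subtract off the mean" idiom

print(shifted)
print(s)
print(centered)
```

Under these assumptions the "subtract the mean" idiom falls out of the two rules without any special casing: the mean is a reduction, so it ignores the wrapped entry, and the subtraction is elementwise, so the wrapped entry passes through unchanged.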