On Friday, November 4, 2011, Gary Strangman <str...@nmr.mgh.harvard.edu> wrote:
>
>> > non-destructive+propagating -- it really depends on exactly what
>> > computations you want to perform, and how you expect them to work. The
>> > main difference is how reduction operations are treated. I kind of
>> > feel like the non-propagating version makes more sense overall, but I
>> > don't know if there's any consensus on that.
>>
>> I think this is further evidence for my idea that a mask should not be
>> undone, but is non destructive. If you want to be able to access the values
>> after masking, have a view, or only apply the mask to a view.
>
> OK, so my understanding of what's meant by propagating is probably incomplete (and is definitely still fuzzy). I'm a little confused by the phrase "a mask should not be undone" though. Say I want to perform a statistical analysis or filtering procedure excluding and (separately) including a handful of outliers? Isn't that a natural case for undoing a mask? Or did you mean something else?
>
> I think I understand the "use a view" option above, though I don't see how one could apply a mask only to a view. What if my view is every other row in a 2D array, and I want to mask the last half of this view? What is the state of the original array once the mask has been applied?
>
> (If this is derailing the progress of this thread, feel free to ignore it.)
>
> -best
> Gary

Ufuncs can be broadly categorized into two groups: element-wise operations (binary ops like +, *, etc., as well as regular functions that return an array whose shape matches the inputs broadcast together) and reduction ops (sum, min, mean, etc.).

For element-wise operations, things are a bit murky for IGNORE, and I defer to Mark's NEP: https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst#id17. This probably should be expanded and clarified in the NEP.

For reduction ops, propagation means that sum([3 5 NA 6]) == NA, just as if you had a NaN in the array. Non-propagating (or skipping, or ignoring) would have that operation produce 14. A mean() for the propagating case would be NA, but 4.6667 (that is, 14/3) for non-propagating.

The part about undoing a mask addresses the case where an operation produces a new array that has ignored elements in it: those elements were never initialized with any value at all. Therefore, "unmasking" those elements and accessing their values makes no sense. This and more are covered in this section of the NEP: https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst#id11

For your stated case, I would have two views of the data (or at least the original data and a view of it). On the second view, I would apply the mask to hide the outliers from the filtering operation and produce a result. The first view (or the original array) sees the same data as it did before the other view took on a mask, so you can perform the filtering operation on it as well and have two separate results. You can keep the masked view for subsequent calculations, and/or keep the original view, and/or create new views with new masks for other analyses, all while keeping the original data intact.

Note that I am speaking of views here in a somewhat more abstract sense that is only loosely tied to numpy's current behavior with respect to views.
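
To make the reduction semantics and the two-view idea a bit more concrete, here is a rough sketch. It is not the NA API from the NEP: it assumes NaN as a stand-in for NA, and numpy.ma's MaskedArray as a stand-in for a "view with a mask". All variable names are made up for illustration.

```python
import numpy as np
import numpy.ma as ma

# NaN stands in for NA here (an assumption; the NEP's NA is a separate concept).
data = np.array([3.0, 5.0, np.nan, 6.0])

# Propagating: the missing value poisons the whole reduction.
prop_sum = np.sum(data)
prop_mean = np.mean(data)

# Non-propagating (skipping): the missing value is simply left out.
skip_sum = np.nansum(data)
skip_mean = np.nanmean(data)

print(prop_sum, prop_mean)   # both are nan
print(skip_sum, skip_mean)   # 14.0 and roughly 4.6667

# The outlier case: one original array, plus a masked "view" of it.
samples = np.array([1.0, 2.0, 3.0, 100.0])
no_outliers = ma.masked_array(samples, mask=samples > 50)

mean_with = samples.mean()         # original data, outlier included
mean_without = no_outliers.mean()  # masked element is skipped

print(mean_with, mean_without)     # 26.5 and 2.0
print(samples)                     # the original array is unchanged
```

Nothing is ever "unmasked" in this sketch: the analysis with outliers uses the original array and the analysis without them uses the masked wrapper, so both results are available and the data itself is never modified.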

As for ndarray.view() specifically, that is an implementation detail that probably shouldn't be in this thread yet, so don't read too much into it.

Cheers!
Ben Root