On Thu, Jun 30, 2011 at 11:46 AM, Lluís <[email protected]> wrote:

> Ok, I think it's time to step back and reformulate the problem by
> completely ignoring the implementation.
>
> Here we have 2 "generic" concepts (i.e., applicable to R), plus another
> extra concept that is exclusive to numpy:
>
> * Assigning np.NA to an array, cannot be undone unless through explicit
>   assignment (i.e., assigning a new arbitrary value, or saving a copy of
>   the original array before assigning np.NA).
>
> * np.NA values propagate by default, unless ufuncs have the "skipna =
>   True" argument (or the other way around, it doesn't really matter to
>   this discussion). In order to avoid passing the argument on each
>   ufunc, we either have some per-array variable for the default "skipna"
>   value (undesirable) or we can make a trivial ndarray subclass that
>   will set the "skipna" argument on all ufuncs through the
>   "_ufunc_wrapper_" mechanism.
>
>
>
> Now, numpy has the concept of views, which adds some more goodies to the
> list of concepts:
>
> * With views, two arrays can share the same physical data, so that
>   assignments to any of them will be seen by others (including NA
>   values).
>
>   The creation of a view is explicitly stated by the user, so its
>   behaviour should not be perceived as odd (after all, you asked for a
>   view).
>
>   The good thing is that with views you can avoid costly array copies if
>   you're careful when writing into these views.
>
>
>
> Now, you can add a new concept: local/temporal/transient missing data.
>
> We can take an existing array and create a view with the new argument
> "transientna = True".
>

This is already there: x.view(masked=1), although the keyword transientna
has appeal, not least because it avoids the word 'mask', which seems a
source of endless confusion. Note that currently this is only supposed to
work if the original array is unmasked.
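
For readers without a build that has np.NA or x.view(masked=1), here is a
minimal sketch of the same idea. It uses numpy.ma purely as a stand-in, which
is an assumption and not the API under discussion: the view shares the data
buffer with the original array but carries its own mask.

```python
import numpy as np

# Stand-in for the proposed x.view(masked=1): a numpy.ma view of x.
# numpy.ma is an assumption here, not the API discussed in this thread.
x = np.array([1.0, 2.0, 3.0, 4.0])
v = x.view(np.ma.MaskedArray)   # shares x's data buffer, owns its own mask

v[0] = np.ma.masked             # "transient NA": only v's mask changes
v[1] = 20.0                     # an ordinary assignment writes through to x

print(x)          # x keeps its value at index 0 and sees the new value at index 1
print(v)          # index 0 is displayed as masked
print(v.mean())   # masked entries are skipped in the reduction
```

Whether the real masked view should behave exactly like this is what the
thread is trying to settle; the sketch only illustrates the "shared data,
local missingness" split.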
> Here, both the view and the "transientna = True" are explicitly stated
> by the user, so it is assumed that she already knows what this is all
> about.
>
> The difference with a regular view is that you also explicitly asked for
> local/temporal/transient NA values.
>
> * Assigning np.NA to an array view with "transientna = True" will
>   *not* be seen by any of the other views (nor the "original" array),
>   but anything else will still work "as usual".
>
>   After all, this is what *you* asked for when using the "transientna =
>   True" argument.
>
>
>
> To conclude, say that others *must not* care about whether the arrays
> they're working with have transient NA values. This way, I can create a
> view with transient NAs, set to NA some uninteresting data, and pass it
> to a routine written by someone else that sets to NA elements that, for
> example, are beyond certain threshold from the mean of the elements.
>
> This would be equivalent to storing a copy of the original array before
> passing it to this 3rd party function, only that "transientna", just as
> views, provide some handy shortcuts to avoid copies.
>
>
> My main point here is that views and local/temporal/transient NAs are
> all *explicitly* requested, so that its behaviour should not appear as
> something unexpected.
>
> Is there an agreement on this?
>
>

Chuck
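
The workflow in the quoted text (mark uninteresting data, hand the view to
someone else's routine, keep the original intact without copying it) can be
sketched as follows. Again numpy.ma is only a stand-in for the proposed
transient-NA view, and mask_outliers and its nsigma parameter are
hypothetical names made up for illustration.

```python
import numpy as np

def mask_outliers(a, nsigma=1.0):
    """Hypothetical third-party routine: mask elements far from the mean."""
    dev = np.ma.absolute(a - a.mean())
    # filled(False) keeps already-masked entries out of the selection
    far = (dev > nsigma * a.std()).filled(False)
    a[far] = np.ma.masked
    return a

data = np.array([1.0, 2.0, 3.0, 100.0, -999.0])
original = data.copy()                 # kept only to check the claim below

view = data.view(np.ma.MaskedArray)    # no copy of the data is made
view[4] = np.ma.masked                 # uninteresting sentinel value
mask_outliers(view)                    # the routine masks more elements

print(view)                            # last two entries are masked
print(np.array_equal(data, original))  # the underlying data was not touched
```

The routine never needs to know that the missing values it sets are
transient; that is decided entirely by whoever created the view.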
