On Wed, Jun 29, 2011 at 1:51 PM, Lluís <[email protected]> wrote:

> Mark Wiebe writes:
> [...]
> > I think that deciding on the value of NA signal values boils down to
> > this question: should 3rd party code be able to interpret missing data
> > information stored in the separate mask array?
>
> > I'm tossing around some variations of ideas using the iterator to
> > provide a buffered mask-based interface that works uniformly with both
> > masked arrays and NA dtypes. This way 3rd party C code only needs to
> > implement one missing data mechanism to fully support both of NumPy's
> > missing data mechanisms.
>
> Nice. If non-numpy C code is bound to see it as an array (i.e., _always_
> oblivious to the mask concept), then you should probably do what I said
> about "(un)merging" the bit pattern and mask-based NAs, but in this case
> can be done on each block given by the iteration window.
>
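[Editorial sketch] The block-wise "(un)merging" idea quoted above might look
roughly like the following in Python. This is only an illustration under
stated assumptions, not the design discussed in the thread: NaN stands in for
the bit-pattern NA, a True mask entry means "missing", and the helper name
merged_blocks and the block size are hypothetical.

```python
import numpy as np

def merged_blocks(data, mask, block=4):
    """Yield float copies of `data`, one block at a time, with masked
    elements replaced by NaN (the stand-in bit-pattern NA)."""
    for start in range(0, data.size, block):
        stop = start + block
        chunk = data[start:stop].astype(float)  # astype copies; `data` is untouched
        chunk[mask[start:stop]] = np.nan        # "merge" the mask into the values
        yield chunk

# Hypothetical input: values plus a separate mask (True = missing).
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
mask = np.array([False, True, False, False, True])

merged = np.concatenate(list(merged_blocks(data, mask, block=2)))

# "Unmerging" recovers the mask from the bit pattern.
recovered_mask = np.isnan(merged)
```

Only one block-sized copy is alive at a time, which is the point of doing the
merge per iteration window rather than over the whole array.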
My hands are a little bit tied because of ABI compatibility, but I'm
thinking of ways I can cause 3rd party C code to fail if it doesn't ask for
the data with the mask when it's masked.

> There's still the possibility of giving a finer granularity interface
> where both are explicitly accessed, but this will probably add yet
> another set of API functions (although the merging interface can be
> implemented on top of this explicit raw iteration interface).
>

Things should be as simple as possible, but having layers of lower level
stuff and higher level stuff is good. This is why, for instance, I
introduced the where= parameter to ufuncs, because it's another useful way
of using the same low-level mechanisms.

> BTW, this has some overlapping with a mail Travis sent long ago about
> dynamically filling the backing buffer contents (in this case with the
> "merged" NA data for 3rd parties).
>
> It might prove completely unsatisfactory (w.r.t. performance), but you
> could also fake a bit-pattern-only sequential array by using mprotect to
> detect the memory accesses and then trigger the production of the merged
> data. This provides means for code using the simple buffer protocol,
> without duplicating the whole structure for NA merges.
>
> This can be complicated even more with some simple strided pattern
> detection to diminish the number of segfaults, as the shape is known.
>

Someone else will have to do stuff like this... ;)

-Mark

>
> Lluis
>
> --
> "And it's much the same thing with knowledge, for whenever you learn
> something new, the whole world becomes that much richer."
> -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
> Tollbooth
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
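[Editorial sketch] The where= parameter to ufuncs mentioned in the reply above
can be illustrated with a small Python example. The array values are made up
for illustration. Note that for where=, True means "compute this element",
which is the opposite of the numpy.ma convention where True means "masked".

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# True selects the positions where the ufunc is actually evaluated.
valid = np.array([True, False, True, False])

# Positions where `valid` is False keep whatever `out` already held,
# so `out` is initialised explicitly here.
out = np.zeros(4, dtype=a.dtype)
np.add(a, b, out=out, where=valid)

print(out)  # [11  0 33  0]
```

Without an explicit out=, the skipped positions would be left uninitialised,
so passing a pre-filled output array is the safer pattern.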
