Hi,

On Thu, Jun 30, 2011 at 6:51 PM, Nathaniel Smith <n...@pobox.com> wrote:
> On Thu, Jun 30, 2011 at 6:31 AM, Matthew Brett <matthew.br...@gmail.com> wrote:
>> In the interest of making the discussion as concrete as possible, here
>> is my draft of an alternative proposal for NAs and masking, based on
>> Nathaniel's comments. Writing it, it seemed to me that Nathaniel is
>> right, that the ideas become much clearer when the NA idea and the
>> MASK idea are separate. Please do pitch in for things I may have
>> missed or misunderstood:
> [...]
>
> Thanks for writing this up! I stuck it up as a gist so we can edit it
> more easily:
>   https://gist.github.com/1056379/
> This is your initial version:
>   https://gist.github.com/1056379/c809715f4e9765db72908c605468304ea1eb2191
> And I made a few changes:
>   https://gist.github.com/1056379/33ba20300e1b72156c8fb655bd1ceef03f8a6583
> Specifically, I added a rationale section, changed np.MASKED to
> np.IGNORE (as per comments in this thread), and added a vowel to
> "propmsk".

Thanks for doing that.

> One thing I wonder about the design is whether having an
> np.MASKED/np.IGNORE value at all helps or hurts. (Occam tells us never
> to multiply entities without necessity! And it's a bit of an odd fit
> to the masking concept, since the whole idea is that masking is a
> property of the array, not the individual datums.)
>
> Currently, I see the following uses for it:
> -- As a return value when someone tries to scalar-index a masked value
> -- As a placeholder to specify masked values when creating an array
>    from a list (but not when assigning to an array later)
> -- As a return value when using propmask=True
> -- As something to display when printing a masked array
>
> Another way of doing things would be:
> -- Scalar-indexing a masked value returns an error, like trying to
>    index past the end of an array. (Slicing etc. would still return a new
>    masked array.)
> -- Having some sort of placeholder does seem nice, but I'm not sure
>    how often you need to type out a masked array. And I notice that
>    numpy.ma does support this (like so: ma.array([1, ma.masked, 3])) but
>    the examples in the docs never use it. The replacement idiom would be
>    something like: my_data = np.array([1, 999, 3], masked=True);
>    my_data.visible = (my_data != 999). So maybe just leave out the
>    placeholder value, at least for version 1?
> -- I don't really see the logic for supporting 'propmask' at all.
>    AFAICT no-one has ever even considered this as a useful feature for
>    numpy.ma, never mind implemented it?
> -- When printing, the numpy.ma approach of using "--" seems much
>    more readable to me than having "IGNORE" all over my screen.
>
> So overall, making these changes would let us simplify the design. But
> maybe propmask is really critical for some use case, or there's some
> good reason to want to scalar-index missing values without getting an
> error?

I'm afraid, like you, I'm a little lost in the world of masking,
because I only need the NAs.
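
For concreteness, here is a minimal pure-Python sketch of the alternative quoted above: scalar-indexing a masked element raises an error, slicing returns a new masked container, the mask is set through a `visible` attribute, and masked entries print as "--". The `MaskedList` class and its methods are hypothetical stand-ins invented for illustration; nothing here is NumPy API, and the `masked=True` keyword from the quoted idiom is part of the proposal rather than something `np.array` accepts.

```python
class MaskedList:
    """Toy stand-in for the proposed masked array (hypothetical, not NumPy API)."""

    def __init__(self, data, visible=None):
        self.data = list(data)
        # visible[i] is True when element i takes part in computations
        if visible is None:
            self.visible = [True] * len(self.data)
        else:
            self.visible = list(visible)
        if len(self.visible) != len(self.data):
            raise ValueError("visible must have the same length as data")

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing keeps the mask and returns a new masked container
            return MaskedList(self.data[index], self.visible[index])
        if not self.visible[index]:
            # Scalar-indexing a masked element is an error,
            # much like indexing past the end of the array
            raise IndexError(f"element {index} is masked")
        return self.data[index]

    def __repr__(self):
        # Follow the numpy.ma convention of showing masked entries as "--"
        items = [str(v) if ok else "--" for v, ok in zip(self.data, self.visible)]
        return "[" + ", ".join(items) + "]"


# Sentinel idiom from the quoted text: load raw data, then hide the sentinel
my_data = MaskedList([1, 999, 3])
my_data.visible = [value != 999 for value in my_data.data]

print(my_data)      # [1, --, 3]
print(my_data[0])   # 1
print(my_data[:2])  # [1, --]
```

With this shape no np.MASKED/np.IGNORE singleton is needed: `my_data[1]` raises `IndexError` instead of returning a placeholder value.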
I was trying to see if I could come up with an API that picked up some
of the syntactic convenience of NAs, without conflating NAs with
IGNOREs. I guess we need some feedback from the 'NA & IGNORE Share
the API' (NISA?) proponents to get an idea of what we've missed.

@Mark, @Chuck, guys - what have we lost here by separating the APIs?

See you,

Matthew
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion