Hi Allan, This is very impressive! I could get the tests that I wrote for my class pass with yours using Quantity with what I would consider very minimal changes. I only could not find a good way to unmask data (I like the idea of setting the mask on some elements via `ma[item] = X`); is this on purpose?
Anyway, it would seem easily at the point where I should comment on your repository rather than in the mailing list! All the best, Marten On Wed, Jun 19, 2019 at 5:45 PM Allan Haldane <allanhald...@gmail.com> wrote: > On 6/18/19 2:04 PM, Marten van Kerkwijk wrote: > > > > > > On Tue, Jun 18, 2019 at 12:55 PM Allan Haldane <allanhald...@gmail.com > > <mailto:allanhald...@gmail.com>> wrote: > > <snip> > > > > > This may be too much to ask from the initializer, but, if so, it > still > > > seems most useful if it is made as easy as possible to do, say, > `class > > > MaskedQuantity(Masked, Quantity): <very few overrides>`. > > > > Currently MaskedArray does not accept ducktypes as underlying arrays, > > but I think it shouldn't be too hard to modify it to do so. Good > idea! > > > > > > Looking back at my trial, I see that I also never got to duck arrays - > > only ndarray subclasses - though I tried to make the code as agnostic as > > possible. > > > > (Trial at > > > https://github.com/astropy/astropy/compare/master...mhvk:utils-masked-class?expand=1 > ) > > > > I already partly navigated this mixin-issue in the > > "MaskedArrayCollection" class, which essentially does > > ArrayCollection(MaskedArray(array)), and only takes about 30 lines of > > boilerplate. That's the backwards encapsulation order from what you > want > > though. > > > > > > Yes, indeed, from a quick trial `MaskedArray(np.arange(3.) * u.m, > > mask=[True, False, False])` does indeed not have a `.unit` attribute > > (and cannot represent itself...); I'm not at all sure that my method of > > just creating a mixed class is anything but a recipe for disaster, > though! > > Based on your suggestion I worked on this a little today, and now my > MaskedArray more easily encapsulates both ducktypes and ndarray > subclasses (pushed to repo). Here's an example I got working with masked > units using unyt: > > [1]: from MaskedArray import X, MaskedArray, MaskedScalar > > [2]: from unyt import m, km > > [3]: import numpy as np > > [4]: uarr = MaskedArray([1., 2., 3.]*km, mask=[0,1,0]) > > [5]: uarr > > MaskedArray([1., X , 3.]) > [6]: uarr + 1*m > > MaskedArray([1.001, X , 3.001]) > [7]: uarr.filled() > > unyt_array([1., 0., 3.], 'km') > [8]: np.concatenate([uarr, 2*uarr]).filled() > unyt_array([1., 0., 3., 2., 0., 6.], '(dimensionless)') > > The catch is the ducktype/subclass has to rigorously follow numpy's > indexing rules, including distinguishing 0d arrays from scalars. For now > only I used unyt in the example above since it happens to be less strict > about dimensionless operations than astropy.units which trips up my > repr code. (see below for example with astropy.units). Note in the last > line I lost the dimensions, but that is because unyt does not handle > np.concatenate. To get that to work we need a true ducktype for units. > > The example above doesn't expose the ".units" attribute outside the > MaskedArray, and it doesn't print the units in the repr. But you can > access them using "filled". > > While I could make MaskedArray forward unknown attribute accesses to the > encapsulated array, that seems a bit dangerous/bug-prone at first > glance, so probably I want to require the user to make a MaskedArray > subclass to do so. I've just started playing with that (probably buggy), > and Ive attached subclass examples for astropy.unit and unyt, with some > example output below. > > Cheers, > Allan > > > > Example using the attached astropy unit subclass: > > >>> from astropy.units import m, km, s > >>> uarr = MaskedQ(np.ones(3), units=km, mask=[0,1,0]) > >>> uarr > MaskedQ([1., X , 1.], units=km) > >>> uarr.units > km > >>> uarr + (1*m) > MaskedQ([1.001, X , 1.001], units=km) > >>> uarr/(1*s) > MaskedQ([1., X , 1.], units=km / s) > >>> (uarr*(1*m))[1:] > MaskedQ([X , 1.], units=km m) > >>> np.add.outer(uarr, uarr) > MaskedQ([[2., X , 2.], > [X , X , X ], > [2., X , 2.]], units=km) > >>> print(uarr) > [1. X 1.] km m > > Cheers, > Allan > > > > > Even if this impossible, I think it is conceptually useful to think > > > about what the masking class should do. My sense is that, e.g., it > > > should not attempt to decide when an operation succeeds or not, > > but just > > > "or together" input masks for regular, multiple-input functions, > > and let > > > the underlying arrays skip elements for reductions by using `where` > > > (hey, I did implement that for a reason... ;-). In particular, it > > > suggests one should not have things like domains and all that (I > never > > > understood why `MaskedArray` did that). If one wants more, the > class > > > should provide a method that updates the mask (a sensible default > > might > > > be `mask |= ~np.isfinite(result)` - here, the class being masked > > should > > > logically support ufuncs and functions, so it can decide what > > "isfinite" > > > means). > > > > I agree it would be nice to remove domains. It would make life > easier, > > and I could remove a lot of twiddly code! I kept it in for now to > > minimize the behavior changes from the old MaskedArray. > > > > > > That makes sense. Could be separated out to a backwards-compatibility > > class later. > > > > > > > In any case, I would think that a basic truth should be that > > everything > > > has a mask with a shape consistent with the data, so > > > 1. Each complex numbers has just one mask, and setting `a.imag` > with a > > > masked array should definitely propagate the mask. > > > 2. For a masked array with structured dtype, I'd similarly say > > that the > > > default is for a mask to have the same shape as the array. But that > > > something like your collection makes sense for the case where one > > wants > > > to mask items in a structure. > > > > Agreed that we should have a single bool per complex or structured > > element, and the mask shape is the same as the array shape. That's > how I > > implemented it. But there is still a problem with complex.imag > > assignment: > > > > >>> a = MaskedArray([1j, 2, X]) > > >>> i = a.imag > > >>> i[:] = MaskedArray([1, X, 1]) > > > > If we make the last line copy the mask to the original array, what > > should the real part of a[2] be? Conversely, if we don't copy the > mask, > > what should the imag part of a[1] be? It seems like we might "want" > the > > masks to be OR'd instead, but then should i[2] be masked after we > just > > set it to 1? > > > > Ah, I see the issue now... Easiest to implement and closest in analogy > > to a regular view would be to just let it unmask a[2] (with whatever is > > in real; user beware!). > > > > Perhaps better would be to special-case such that `imag` returns a > > read-only view of the mask. Making `imag` itself read-only would prevent > > possibly reasonable things like `i[np.isclose(i, 0)] = 0` - but there is > > no reason this should update the mask. > > > > Still, neither is really satisfactory... > > > > > > > > > p.s. I started trying to implement the above "Mixin" class; will > > try to > > > clean that up a bit so that at least it uses `where` and push it > up. > > > > I played with "where", but didn't include it since 1.17 is not > released. > > To avoid duplication of effort, I've attached a diff of what I > tried. I > > actually get a slight slowdown of about 10% by using where... > > > > > > Your implementation is indeed quite similar to what I got in > > __array_ufunc__ (though one should "&" the where with ~mask). > > > > I think the main benefit is not to presume that whatever is underneath > > understands 0 or 1, i.e., avoid filling. > > > > > > If you make progress with the mixin, a push is welcome. I imagine a > > problem is going to be that np.isscalar doesn't work to detect duck > > scalars. > > > > I fear that in my attempts I've simply decided that only array scalars > > exist... > > > > -- Marten > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion@python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@python.org > https://mail.python.org/mailman/listinfo/numpy-discussion >
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion