Re: [Numpy-discussion] New Indexing Methods Revival #N (subclasses!)
On Sun, 2016-09-04 at 11:20 -0400, Marten van Kerkwijk wrote:
> Hi Sebastian,
>
> I haven't given this as much thought as it deserves, but thought I
> would comment from the astropy perspective, where we both have direct
> subclasses of `ndarray` (`Quantity`, `Column`, `MaskedColumn`) and
> classes that store their data internally as ndarray (subclass)
> instances (`Time`, `SkyCoord`, ...).
>
> One comment would be that if one were to introduce a special method,
> one should perhaps think a bit more broadly, and capture more than the
> indexing methods with it. I wonder about this because for the
> array-holding classes mentioned above, we initially just had
> `__getitem__`, which got the relevant items from the underlying
> arrays, and then constructed a new instance with those. But recently
> we realised that methods like `reshape`, `transpose`, etc., require
> essentially the same steps, and so we constructed a new
> `ShapedLikeNDArray` mixin, which provides all of those [1] as long as
> one defines a single `_apply` method. (Indeed, it turns out that the
> same technique works without any real change for some numpy functions
> such as `np.broadcast_to`.)
>
> That said, in the actual ndarray subclasses, we have not found a need
> to overwrite any of the reshaping methods, since those methods are all
> handled OK via `__array_finalize__`. We do overwrite `__getitem__`
> (and `item`) as we need to take care of scalars. And we would
> obviously have to overwrite `oindex`, etc., as well, for the same
> reason, so in that respect a common method might be useful.
>
> However, perhaps it is worth considering that the only reason we need
> to overwrite them in the first place, unlike what is the case for all
> the shape-changing methods, is that scalar output does not get put
> through `__array_finalize__`. Might it be an idea to have the new
> indexing methods return array scalars instead of normal ones so we can
> get rid of this?
I did not realize that newer NumPy versions are special in their scalar handling? Indexing (already before 1.9, I believe) always goes through PyArray_ScalarReturn or so, which I thought was used by almost all functions.

If you mean the attributes (`oindex`, etc.), they could of course behave a bit differently, though I am not sure to what extent that actually helps, since it would also create a disparity. If we implement a new special method (`__numpy_getitem__`), they definitely should behave slightly differently in some places. One option might be to not even do the wrapping, but leave it to the subclass. However, if you have an array with arrays inside, knowing whether to return a scalar would have to rely on inspecting the index object, which is why I suggested that the indexer provide a few extra pieces of information (such as this one).

Of course, since the scalar return goes through a ScalarReturn function, that function could maybe also be taught to indicate the scalar to `__array_finalize__`/`__array_wrap__` (not sure which exactly applies).

- Sebastian

> All the best,
>
> Marten
>
> [1] https://github.com/astropy/astropy/blob/master/astropy/utils/misc.py#L856

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
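The scalar issue discussed in this exchange can be reproduced with a few lines. This is only a minimal sketch: `Plain` and `Wrapped` are hypothetical subclasses invented for illustration, and the re-wrapping in `Wrapped.__getitem__` is one possible workaround, not the code astropy uses.

```python
import numpy as np


class Plain(np.ndarray):
    """Hypothetical subclass that overrides nothing."""


class Wrapped(np.ndarray):
    """Hypothetical subclass that re-wraps scalar results of indexing."""

    def __getitem__(self, index):
        out = super().__getitem__(index)
        if not isinstance(out, np.ndarray):
            # Indexing returned a NumPy scalar, which never passed through
            # __array_finalize__; turn it into a 0-d instance of the subclass.
            out = np.asarray(out).view(type(self))
        return out


a = np.arange(3.0).view(Plain)
print(type(a[:2]).__name__)             # slicing keeps the subclass
print(type(a.reshape(3, 1)).__name__)   # so do the shape-changing methods
print(type(a[0]).__name__)              # a single element is a NumPy scalar

w = np.arange(3.0).view(Wrapped)
print(type(w[0]).__name__, w[0].shape)  # 0-d subclass instance after re-wrapping
```

Without the override, only the scalar case loses the subclass, which is why `__getitem__` (and any new indexing attribute) would need special treatment.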
Re: [Numpy-discussion] New Indexing Methods Revival #N (subclasses!)
Hi Sebastian,

I haven't given this as much thought as it deserves, but thought I would comment from the astropy perspective, where we both have direct subclasses of `ndarray` (`Quantity`, `Column`, `MaskedColumn`) and classes that store their data internally as ndarray (subclass) instances (`Time`, `SkyCoord`, ...).

One comment would be that if one were to introduce a special method, one should perhaps think a bit more broadly, and capture more than the indexing methods with it. I wonder about this because for the array-holding classes mentioned above, we initially just had `__getitem__`, which got the relevant items from the underlying arrays, and then constructed a new instance with those. But recently we realised that methods like `reshape`, `transpose`, etc., require essentially the same steps, and so we constructed a new `ShapedLikeNDArray` mixin, which provides all of those [1] as long as one defines a single `_apply` method. (Indeed, it turns out that the same technique works without any real change for some numpy functions such as `np.broadcast_to`.)

That said, in the actual ndarray subclasses, we have not found a need to overwrite any of the reshaping methods, since those methods are all handled OK via `__array_finalize__`. We do overwrite `__getitem__` (and `item`) as we need to take care of scalars. And we would obviously have to overwrite `oindex`, etc., as well, for the same reason, so in that respect a common method might be useful.

However, perhaps it is worth considering that the only reason we need to overwrite them in the first place, unlike what is the case for all the shape-changing methods, is that scalar output does not get put through `__array_finalize__`. Might it be an idea to have the new indexing methods return array scalars instead of normal ones so we can get rid of this?

All the best,

Marten

[1] https://github.com/astropy/astropy/blob/master/astropy/utils/misc.py#L856
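The single-`_apply` pattern described in this message might look roughly like the sketch below. It is a simplified illustration, not astropy's `ShapedLikeNDArray`: the names `ShapedLike` and `Coord`, the attributes `lon` and `lat`, and the exact `_apply` signature are assumptions made for the example.

```python
import numpy as np


class ShapedLike:
    """Hypothetical mixin: every shape-changing operation funnels into _apply."""

    def _apply(self, method, *args, **kwargs):
        raise NotImplementedError

    def __getitem__(self, item):
        return self._apply("__getitem__", item)

    def reshape(self, *args, **kwargs):
        return self._apply("reshape", *args, **kwargs)

    def transpose(self, *args, **kwargs):
        return self._apply("transpose", *args, **kwargs)


class Coord(ShapedLike):
    """Hypothetical array-holding class with two internal arrays."""

    def __init__(self, lon, lat):
        self.lon = np.asarray(lon)
        self.lat = np.asarray(lat)

    def _apply(self, method, *args, **kwargs):
        # `method` is either the name of an ndarray method or a callable
        # such as np.broadcast_to that takes the array as first argument.
        if callable(method):
            def apply(arr):
                return method(arr, *args, **kwargs)
        else:
            def apply(arr):
                return getattr(arr, method)(*args, **kwargs)
        # Apply the same operation to every internal array and rebuild.
        return type(self)(apply(self.lon), apply(self.lat))


c = Coord(np.arange(6.0), 2 * np.arange(6.0))
print(c.reshape(2, 3).lon.shape)
print(c[1:4].lat)
print(c._apply(np.broadcast_to, (2, 6)).lon.shape)
```

The class only has to say how one operation maps over its internal arrays; indexing, reshaping and function calls then share that single code path.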
[Numpy-discussion] Pull Request regarding meshgrid
https://github.com/numpy/numpy/pull/7984

Hi everybody,

I created my first pull request for numpy and, as mentioned in the numpy development workflow documentation, I hereby post a link to it and a short description to the mailing list. Please take a look.

I didn't find a good way to create a contour plot of data of the form:

    [(x1, y1, f(x1, y1)), (x2, y2, f(x2, y2)), ..., (xn, yn, f(xn, yn))]

In order to do a contour plot, one has to bring the data into the meshgrid format. One possibility would be complicated sorting and reshaping of the data, but this is not easily possible, especially if values are missing (not all combinations of (x, y) are contained in the data).

Another way, which is used in all tutorials about contour plotting, is to create the meshgrid beforehand and then apply the function to the meshgrid matrices:

    x = np.linspace(-3, 3, n)
    y = np.linspace(-3, 3, n)
    X, Y = np.meshgrid(x, y)
    Z = f(X, Y)
    plt.contour(X, Y, Z)

But if one does not have the function but only the data, this is not an option either.

My function essentially creates a dictionary

    {(x1, y1): f(x1, y1), (x2, y2): f(x2, y2), ..., (xn, yn): f(xn, yn)}

with the coordinate tuples as keys and function values as values. Then it creates a meshgrid from all unique x and y coordinates (X and Y). The dictionary is then used to create the matrix Z, filling in np.nan for all missing values. This allows one to do the following, with x and y being the coordinates and z the corresponding function values:

    plt.contour(*meshgridify(x, y, f=z))

Maybe there is a simpler solution, but I didn't find one.
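For readers who want to try the idea without the pull request, here is a minimal sketch of such a function. It is not the code from the PR: it reuses the name and call signature `meshgridify(x, y, f=...)` from the message, but replaces the dictionary lookup with the inverse indices returned by `np.unique`, and it assumes each (x, y) pair occurs at most once.

```python
import numpy as np


def meshgridify(x, y, f):
    """Scatter values f given at points (x, y) onto a meshgrid, NaN where missing.

    Sketch only; assumes 1-D inputs of equal length with unique (x, y) pairs.
    """
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    f = np.asarray(f, dtype=float).ravel()

    # Sorted unique coordinates, plus the position of each sample within them.
    xu, xi = np.unique(x, return_inverse=True)
    yu, yi = np.unique(y, return_inverse=True)

    X, Y = np.meshgrid(xu, yu)       # shape (len(yu), len(xu))
    Z = np.full(X.shape, np.nan)     # missing combinations stay NaN
    Z[yi, xi] = f                    # rows follow y, columns follow x
    return X, Y, Z


# Three samples on a 2x2 grid; the point (1, 1) is missing.
X, Y, Z = meshgridify([0, 1, 0], [0, 0, 1], f=[1.0, 2.0, 3.0])
print(Z)
```

The returned triple can be passed straight to a contour routine, for example `plt.contour(X, Y, Z)` with matplotlib.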
Re: [Numpy-discussion] New Indexing Methods Revival #N (subclasses!)
On Sun, 2016-09-04 at 14:10 +0200, Sebastian Berg wrote:
> On Sat, 2016-09-03 at 21:08 +0200, Sebastian Berg wrote:
> > Hi all,
> >
> > not that I am planning to spend much time on this right now,
> > however, I did a small rebase of the stuff I had (did not push yet)
> > on oindex and remembered the old problem ;).
> >
> > The one remaining issue I have with adding things like (except
> > making the code prettier and writing tests):
> >
> >     arr.oindex[...]  # outer/orthogonal indexing
> >     arr.vindex[...]  # Picking of elements (much like current)
> >     arr.lindex[...]  # current behaviour for backward compat
> >
> > is what to do about subclasses. Now what I can do (and have
> > currently in my branch) is to tell someone on
> > `subclass.oindex[...]`: This won't work, the subclass implements
> > `__getitem__` or `__setitem__` so I don't know if the result would
> > be correct (it's a bit annoying if you also warn about using those
> > attributes, but...).
>
> Hmm, I am considering exposing a new indexing helper object, so that
> subclasses could implement something like `__numpy_getitem__` and
> `__numpy_setitem__`, and if they do (and preferably nothing else) they
> would get passed a small object with some information about the
> indexing operation. The subclass would then implement:
>
> ```
> def __numpy_setitem__(self, indexer, values):
>     indexer.method  # one of {"plain", "oindex", "vindex", "lindex"}
>     indexer.scalar  # Will the result be a scalar?
>     indexer.view  # Will the result be a view or a copy?
>     # More information might be possible (note that not all checks are
>     # done at this point, just basic checks will have happened already).
>
>     # Do some code that prepares self or values, could also use
>     # indexer for another array (e.g. mask) of the same shape.
>
>     result = indexer(self, values)
>
>     # Do some code to fix up the result if necessary.
>     # Should discuss whether result is first a plain ndarray or
>     # already wrapped.
> ```

Hmm, field access is a bit annoying, but I guess it can (or has to) be included.

> This could be implemented on the C side without much hassle, I think.
> Of course it adds some new API which we would have to support
> indefinitely. But it seems something like this would also fix the
> hassle of identifying e.g. if the result should be a scalar for a
> subclass (which may even be impossible in some cases).
>
> Would be very happy about feedback from subclassers!
>
> - Sebastian
>
> > However, with or without such an error, we need a nice way for
> > subclasses to define these attributes! This is important even
> > within numpy at least for masked arrays (possibly also matrix and
> > memmap).
> >
> > They (typically) do some stuff before or after the plain indexing
> > operation, so how do we make it convenient to allow them to do the
> > same stuff for the special indexing attributes without weird code
> > duplication? I can think of things, but nothing too great yet so
> > maybe you guys got an elegant idea.
> >
> > - Sebastian
Re: [Numpy-discussion] New Indexing Methods Revival #N (subclasses!)
On Sat, 2016-09-03 at 21:08 +0200, Sebastian Berg wrote:
> Hi all,
>
> not that I am planning to spend much time on this right now, however,
> I did a small rebase of the stuff I had (did not push yet) on oindex
> and remembered the old problem ;).
>
> The one remaining issue I have with adding things like (except making
> the code prettier and writing tests):
>
>     arr.oindex[...]  # outer/orthogonal indexing
>     arr.vindex[...]  # Picking of elements (much like current)
>     arr.lindex[...]  # current behaviour for backward compat
>
> is what to do about subclasses. Now what I can do (and have currently
> in my branch) is to tell someone on `subclass.oindex[...]`: This
> won't work, the subclass implements `__getitem__` or `__setitem__` so
> I don't know if the result would be correct (it's a bit annoying if
> you also warn about using those attributes, but...).

Hmm, I am considering exposing a new indexing helper object, so that subclasses could implement something like `__numpy_getitem__` and `__numpy_setitem__`, and if they do (and preferably nothing else) they would get passed a small object with some information about the indexing operation. The subclass would then implement:

```
def __numpy_setitem__(self, indexer, values):
    indexer.method  # one of {"plain", "oindex", "vindex", "lindex"}
    indexer.scalar  # Will the result be a scalar?
    indexer.view  # Will the result be a view or a copy?
    # More information might be possible (note that not all checks are
    # done at this point, just basic checks will have happened already).

    # Do some code that prepares self or values, could also use
    # indexer for another array (e.g. mask) of the same shape.

    result = indexer(self, values)

    # Do some code to fix up the result if necessary.
    # Should discuss whether result is first a plain ndarray or
    # already wrapped.
```

This could be implemented on the C side without much hassle, I think. Of course it adds some new API which we would have to support indefinitely. But it seems something like this would also fix the hassle of identifying e.g. if the result should be a scalar for a subclass (which may even be impossible in some cases).

Would be very happy about feedback from subclassers!

- Sebastian

> However, with or without such an error, we need a nice way for
> subclasses to define these attributes! This is important even within
> numpy at least for masked arrays (possibly also matrix and memmap).
>
> They (typically) do some stuff before or after the plain indexing
> operation, so how do we make it convenient to allow them to do the
> same stuff for the special indexing attributes without weird code
> duplication? I can think of things, but nothing too great yet so
> maybe you guys got an elegant idea.
>
> - Sebastian
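To make the proposed protocol easier to picture, it can be mocked in pure Python. Everything below is hypothetical: NumPy does not call `__numpy_getitem__`, the `Indexer` class is an invented stand-in for the helper object, only its `method` and `scalar` attributes are mimicked, and the `Masked` subclass dispatches to the hook by hand from `__getitem__`.

```python
import numpy as np


class Indexer:
    """Hypothetical stand-in for the proposed helper object (not a NumPy API)."""

    def __init__(self, arr, index, method="plain"):
        self.index = index
        self.method = method
        # The mock simply performs the indexing once on a base-class view
        # to find out whether the result is a scalar.
        probe = np.asarray(arr)[index]
        self.scalar = not isinstance(probe, np.ndarray)

    def __call__(self, arr):
        # Apply the stored index to any array of the same shape.
        return np.asarray(arr)[self.index]


class Masked(np.ndarray):
    """Hypothetical subclass carrying a boolean mask of the same shape."""

    def __new__(cls, data, mask):
        obj = np.asarray(data).view(cls)
        obj.mask = np.asarray(mask, dtype=bool)
        return obj

    def __array_finalize__(self, obj):
        self.mask = getattr(obj, "mask", None)

    def __numpy_getitem__(self, indexer):
        data = indexer(self)       # plain indexing of the data
        mask = indexer(self.mask)  # same indexer reused for the mask
        if indexer.scalar:
            return np.ma.masked if mask else data
        return type(self)(data, mask)

    def __getitem__(self, index):
        # Manual dispatch, because NumPy itself knows nothing about the hook.
        return self.__numpy_getitem__(Indexer(self, index))


m = Masked([1.0, 2.0, 3.0], [False, True, False])
print(m[1:].mask)   # the mask follows the data through slicing
print(m[0], m[1])   # scalar results are decided via indexer.scalar
```

The point of the sketch is that the subclass writes its pre- and post-processing once, and the same hook could serve `oindex`, `vindex` and plain indexing by inspecting `indexer.method`.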