On Fri, 2021-01-15 at 18:38 +0000, Israel, Daniel M wrote:
> I hope this is the right place to post this.
>
> The numpy documentation talks about two methods for making
> ndarray-like objects, subclassing and dispatching, but it is not
> clear to me which one is most appropriate for which purpose. Can
> someone provide, or point me to, some guidance, about this? I’m
> particularly interested in what happens if there are multiple layers
> of subclassing. Can you subclass from a subclass? Dispatch from a
> dispatch? Subclass from a dispatch and vice versa?

All of those things can be made to work with appropriate use of `super()`. Subclassing and dispatching are not mutually exclusive (`astropy.units.Quantity` is an example that uses both).

If you want to go well beyond typical NumPy behaviour, I would suggest focusing on dispatching. If all you want is to add a single method, subclassing should be a pretty good fit, assuming you don't mind that some operations may end up giving you a normal array, or returning your array type when a normal array would fit better.

For example, MaskedArray in NumPy is a subclass, but it adds so many additional things that dispatching without subclassing is likely a better fit. (Opinions will probably differ. I expect that with subclassing some things will "just work", but sometimes the things that "just work" may also do the wrong thing.) Ignoring the mask of a MaskedArray is always a serious issue.

> My specific application is a pair of classes, SpectralArray and
> PhysicalArray, that use numpy.fft to provide a to_physical() and
> to_spectral() method, respectively, to simplify writing
> pseudo-spectral codes. Initially this will be serial, but the
> implementation will eventually use a mechanism similar to mpi4py-fft
> to allow the arrays to be distributed. Further, it would be nice to
> be able to make the code interoperable with the cupy CUDA numpy
> implementation, so that the sub array on each MPI process could use
> GPU accelerated FFTs.

It sounds like you mostly want to add a set of methods, so making a MixIn class and using subclassing may well be a good option. You can still add `__array_function__` or `__array_ufunc__` with a fallback to `super()` to override specific functions.

If there is more to it (e.g. metadata for frequency scales or similar), it may be better to skip subclassing altogether. (Just to mention: in such a case `xarray` may be interesting.)

Since you are also looking for distributed arrays, you should probably look into Dask (I do not know `mpi4py` though).
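
To make the MixIn suggestion above concrete, here is a minimal serial sketch. The names `GuardedFunctionsMixin` and `_blocked` are made up for illustration, the choice of which functions to reject is only an assumption about what might be useful, and it uses the complex `fftn`/`ifftn` pair rather than the real-valued variants, so the round trip returns a complex array.

```python
import numpy as np


class GuardedFunctionsMixin:
    """Hypothetical mixin: reject selected NumPy functions, defer the rest."""

    # Subclasses list the NumPy functions that make no sense for them.
    _blocked = ()

    def __array_function__(self, func, types, args, kwargs):
        if func in self._blocked:
            name = getattr(func, "__name__", repr(func))
            raise TypeError(
                f"{name} is not meaningful for {type(self).__name__}"
            )
        # Fall back to ndarray's default handling for everything else.
        return super().__array_function__(func, types, args, kwargs)


class PhysicalArray(GuardedFunctionsMixin, np.ndarray):
    """Data in physical space (illustrative only)."""

    _blocked = (np.fft.ifft, np.fft.ifftn)

    def to_spectral(self):
        # np.asarray drops the subclass, so the FFT runs on a plain ndarray.
        return np.fft.fftn(np.asarray(self)).view(SpectralArray)


class SpectralArray(GuardedFunctionsMixin, np.ndarray):
    """Data in spectral space (illustrative only)."""

    _blocked = (np.fft.fft, np.fft.fftn)

    def to_physical(self):
        # Complex inverse transform; the result is viewed as PhysicalArray.
        return np.fft.ifftn(np.asarray(self)).view(PhysicalArray)


x = np.arange(8.0).view(PhysicalArray)
s = x.to_spectral()      # SpectralArray
back = s.to_physical()   # PhysicalArray again (complex dtype)
```

With this, ufuncs such as `x + 1` still return a `PhysicalArray`, while calling `np.fft.fftn(s)` on data that is already spectral raises a `TypeError`. This only covers the serial NumPy case.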

Dask arrays consist of distributed NumPy or CuPy arrays and make use of the dispatching in NumPy. Note that NumPy arrays themselves cannot be distributed or GPU-backed, and you cannot add that capability with a subclass. So if that is the aim, do not subclass ndarray unless you are prepared to create multiple (sub)classes (one each for ndarray, Dask array, and CuPy array).

Cheers,

Sebastian

>
> Advice? Thanks.
>
> --
> Daniel M. Israel, Ph. D.
> XCP-4: Methods & Algorithms
> Mailstop F644
> Los Alamos National Laboratory
> 505 665 5664
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion