On Mon, 2022-01-10 at 15:25 -0800, Stephan Hoyer wrote: > There are no C-level APIs for __array_function__ or __array_ufunc__, > so > yes, at a high-level Python methods will be invoked by NumPy. > > That said, NumPy's logic for handling __array_function__ and > __array_ufunc__ methods is written in highly optimized C. If you > wrote your > own __array_function__ and __array_ufunc__ methods on Quantity using > C, and > there should be very little overhead from Python. I would guess you > might > see a considerable speed-ups due to moving highly dynamic Python > logic into > a compiled language. >
There are some slowish Python-C interface boundaries that could be optimized away [1], although we should first do the generic optimizations that I mentioned before the break (at least for `__array_function__`) before considering that. Another approach is now becoming available (and experimenting on it could hasten it a lot), though, if that interests you. That is to use new user defined DTypes: If you use a dtype that holds the unit, there is no wrapping of the NumPy array at all! If done right, you stay within C and the overhead should be effectively indistinguishable from normal NumPy array operations. There is a very minimal proof of concept here (not designed to be fast or good): https://github.com/seberg/experimental_user_dtypes This works with NumPy main branch (albeit there is a small tweak needed right now [2]). I was hoping to look into a somewhat more full featured prototype soonish (the fact that it uses Cython turned out a bad idea). Currently, there is mainly one larger feature missing: A Unit DType should be able to call the existing NumPy loops conveniently (e.g. use the normal multiplication function but tag on unit handling). Cheers, Sebastian [1] Basically, just hopping back from C to Python (even if just to call C) is a bit sluggish when you look at small to mid-sized array operations. That could probably be avoided, but needs small API extension I think. I am not sure how worthwhile that is. [2] Unfortunately, I a merge conflict or so happened. So if you want to try that example code, you currently need to change one line in `numpy/include/numpy/experimental_dtype_api.h` to read: #define __EXPERIMENTAL_DTYPE_VERSION 3 (The version should be 3, not 2.) > > On Mon, Jan 10, 2022 at 12:56 PM Juan Luis Cano Rodríguez < > hello@juanlu.space> wrote: > > > Hi all, > > > > I am a long time user of astropy.units, which allows one to define > > quantities with physical units as follows: > > > > > > > from astropy import units as u > > > > > 10 << u.cm > > <Quantity 10. cm> > > > > > np.sqrt(4 << u.m ** 2) > > <Quantity 2. m> > > > > > ([1, 1, 0] << u.m) @ ([0, 10, 20] << u.cm / u.s) > > <Quantity 10. cm m / s> > > > > > (([1, 1, 0] << u.m) * ([0, 10, 20] << u.cm / u.s)).to(u.m ** > > > > > 2 / u.s) > > <Quantity [0. , 0.1, 0. ] m2 / s> > > > > The mechanism works by subclassing numpy.ndarray and leveraging > > __array_function__ support aka NEP 18. Internally it is something > > like this: > > > > > > > v = np.array(10, dtype=np.float64, copy=False, order=None, > > > > > subok=True, > > ndmin=0) > > > > > vu = v.view(u.Quantity) > > > > > vu._set_unit(u.cm) > > > > > vu > > <Quantity 10. cm> > > > > However, over the years I have been constantly annoyed by the fact > > that it > > is tremendously slow. I'm not critizing Astropy devs, the problem > > seems > > objectively difficult: although some code paths could be optimized > > at the > > cost of losing some syntactic sugar or breaking backwards > > compatibility, > > `isinstance` calls and introspection in general are slow. > > > > Setting aside the question of trying to make astropy.units faster > > (which > > may or may not be possible), I was thinking how feasible could it > > be to > > implement something similar, but using a compiled language instead > > (C, > > Cython, Rust, whatever) and leveraging "modern" dispatch > > mechanisms. But > > after reading about NEP 18, NEP 47, uarray, and various pull > > requests and > > issues here and there ( > > https://labs.quansight.org/blog/2021/11/pydata-extensibility-vision/ > > and > > https://github.com/scipy/scipy/issues/10204#issuecomment-787067947 > > among > > others) I don't fully grasp the differences between the approaches, > > and I > > don't know if what I am proposing is feasible at all. Since IIUC > > the numpy > > function or ufunc is passed to __array_function__ and > > __array_ufunc__ > > respectively, I am not sure how would that interact with the code > > being in > > a foreign language (I assume the NumPy C API would have to be > > used). > > > > If folks have advice, ideas, or suggestions for a direction, I'll > > be happy > > to read them. > > > > Best! > > _______________________________________________ > > NumPy-Discussion mailing list -- numpy-discussion@python.org > > To unsubscribe send an email to numpy-discussion-le...@python.org > > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ > > Member address: sho...@gmail.com > > > _______________________________________________ > NumPy-Discussion mailing list -- numpy-discussion@python.org > To unsubscribe send an email to numpy-discussion-le...@python.org > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ > Member address: sebast...@sipsolutions.net
signature.asc
Description: This is a digitally signed message part
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com