On January 11, 2022, Sebastian Berg <sebast...@sipsolutions.net> wrote:
> On Mon, 2022-01-10 at 15:25 -0800, Stephan Hoyer wrote:
> > There are no C-level APIs for __array_function__ or __array_ufunc__,
> > so
> > yes, at a high-level Python methods will be invoked by NumPy.
> > 
> > That said, NumPy's logic for handling __array_function__ and
> > __array_ufunc__ methods is written in highly optimized C. If you
> > wrote your
> > own __array_function__ and __array_ufunc__ methods on Quantity using
> > C, and
> > there should be very little overhead from Python. I would guess you
> > might
> > see a considerable speed-ups due to moving highly dynamic Python
> > logic into
> > a compiled language.
> > 
>
> There are some slowish Python-C interface boundaries that could be
> optimized away [1], although we should first do the generic
> optimizations that I mentioned before the break (at least for
> `__array_function__`) before considering that.
>
> Another approach is now becoming available (and experimenting on it
> could hasten it a lot), though, if that interests you.
>
> That is to use new user defined DTypes: If you use a dtype that holds
> the unit, there is no wrapping of the NumPy array at all!
> If done right, you stay within C and the overhead should be
> effectively
> indistinguishable from normal NumPy array operations.
> There is a very minimal proof of concept here (not designed to be fast
> or good):
>
>  https://github.com/seberg/experimental_user_dtypes
>
> This works with NumPy main branch (albeit there is a small tweak
> needed
> right now [2]).

Thanks both for the insight! The dtype solution looks promising. I tried
to make it work without success, opened an issue in your repository to
continue the conversation there.

Best,

Juan Luis

> I was hoping to look into a somewhat more full featured prototype
> soonish (the fact that it uses Cython turned out a bad idea).
>
> Currently, there is mainly one larger feature missing: A Unit DType
> should be able to call the existing NumPy loops conveniently (e.g. use
> the normal multiplication function but tag on unit handling).
>
> Cheers,
>
> Sebastian
>
>
> [1] Basically, just hopping back from C to Python (even if just to
> call
> C) is a bit sluggish when you look at small to mid-sized array
> operations. That could probably be avoided, but needs small API
> extension I think. I am not sure how worthwhile that is.
>
> [2] Unfortunately, I a merge conflict or so happened. So if you want
> to
> try that example code, you currently need to change one line in
> `numpy/include/numpy/experimental_dtype_api.h` to read:
>
> #define __EXPERIMENTAL_DTYPE_VERSION 3
>
> (The version should be 3, not 2.)
>
> > 
> > On Mon, Jan 10, 2022 at 12:56 PM Juan Luis Cano Rodríguez <
> > hello@juanlu.space> wrote:
> > 
> > > Hi all,
> > > 
> > > I am a long time user of astropy.units, which allows one to define
> > > quantities with physical units as follows:
> > > 
> > > > > > from astropy import units as u
> > > > > > 10 << u.cm
> > > <Quantity 10. cm>
> > > > > > np.sqrt(4 << u.m ** 2)
> > > <Quantity 2. m>
> > > > > > ([1, 1, 0] << u.m) @ ([0, 10, 20] << u.cm / u.s)
> > > <Quantity 10. cm m / s>
> > > > > > (([1, 1, 0] << u.m) * ([0, 10, 20] << u.cm / u.s)).to(u.m **
> > > > > > 2 / u.s)
> > > <Quantity [0. , 0.1, 0. ] m2 / s>
> > > 
> > > The mechanism works by subclassing numpy.ndarray and leveraging
> > > __array_function__ support aka NEP 18. Internally it is something
> > > like this:
> > > 
> > > > > > v = np.array(10, dtype=np.float64, copy=False, order=None,
> > > > > > subok=True,
> > > ndmin=0)
> > > > > > vu = v.view(u.Quantity)
> > > > > > vu._set_unit(u.cm)
> > > > > > vu
> > > <Quantity 10. cm>
> > > 
> > > However, over the years I have been constantly annoyed by the fact
> > > that it
> > > is tremendously slow. I'm not critizing Astropy devs, the problem
> > > seems
> > > objectively difficult: although some code paths could be optimized
> > > at the
> > > cost of losing some syntactic sugar or breaking backwards
> > > compatibility,
> > > `isinstance` calls and introspection in general are slow.
> > > 
> > > Setting aside the question of trying to make astropy.units faster
> > > (which
> > > may or may not be possible), I was thinking how feasible could it
> > > be to
> > > implement something similar, but using a compiled language instead
> > > (C,
> > > Cython, Rust, whatever) and leveraging "modern" dispatch
> > > mechanisms. But
> > > after reading about NEP 18, NEP 47, uarray, and various pull
> > > requests and
> > > issues here and there (
> > > https://labs.quansight.org/blog/2021/11/pydata-extensibility-
> vision/
> > >  and
> > > https://github.com/scipy/scipy/issues/10204#issuecomment-
> 787067947 
> > > among
> > > others) I don't fully grasp the differences between the
> approaches,
> > > and I
> > > don't know if what I am proposing is feasible at all. Since IIUC
> > > the numpy
> > > function or ufunc is passed to __array_function__ and
> > > __array_ufunc__
> > > respectively, I am not sure how would that interact with the code
> > > being in
> > > a foreign language (I assume the NumPy C API would have to be
> > > used).
> > > 
> > > If folks have advice, ideas, or suggestions for a direction, I'll
> > > be happy
> > > to read them.
> > > 
> > > Best!
> > > _______________________________________________
> > > NumPy-Discussion mailing list -- numpy-discussion@python.org
> > > To unsubscribe send an email to numpy-discussion-le...@python.org
> > > https://mail.python.org/mailman3/lists/numpy-
> discussion.python.org/
> > > Member address: sho...@gmail.com
> > > 
> > _______________________________________________
> > NumPy-Discussion mailing list -- numpy-discussion@python.org
> > To unsubscribe send an email to numpy-discussion-le...@python.org
> > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> > Member address: sebast...@sipsolutions.net


> _______________________________________________
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: hello@juanlu.space
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com

Reply via email to