Thanks Sebastian, I have your example running and will start experimenting with DType.
Lee

On Thu, Mar 25, 2021 at 5:32 PM Sebastian Berg <sebast...@sipsolutions.net> wrote:

> On Wed, 2021-03-17 at 17:12 -0500, Sebastian Berg wrote:
> > On Wed, 2021-03-17 at 07:56 -0500, Lee Johnston wrote:
> > <snip>
> >
> > 3. In parallel, I will create a small "toy" DType based on that
> > experimental API. Probably in a separate repo (in the NumPy
> > organization?).
>
> So this is started. What you need to do right now if you want to try is
> work off this branch in NumPy:
>
> https://github.com/numpy/numpy/compare/main...seberg:experimental-dtype-api
>
> Install NumPy with `NPY_USE_NEW_CASTINGIMPL=1 python -mpip install .`
> or your favorite alternative.
> (The `NPY_USE_NEW_CASTINGIMPL=1` should be unnecessary very soon,
> working off a branch and not "main" will hopefully also be unnecessary
> soon.)
>
> Then fetch: https://github.com/seberg/experimental_user_dtypes
> and install it as well in the same environment.
>
> After that, you can jump through the hoop of setting:
>
>     NUMPY_EXPERIMENTAL_DTYPE_API=1
>
> And you can enjoy these types of examples (while expecting hard crashes
> when going too far beyond!):
>
>     from experimental_user_dtypes import float64unit as u
>     import numpy as np
>
>     F = np.array([u.Quantity(70., "Fahrenheit")])
>     C = F.astype(u.Float64UnitDType("Celsius"))
>     print(repr(C))
>     # array([21.11111111111115 °C], dtype='Float64UnitDType(degC)')
>
>     m = np.array([u.Quantity(5., "m")])
>     m_squared = u.multiply(m, m)
>     print(repr(m_squared))
>     # array([25.0 m**2], dtype='Float64UnitDType(m**2)')
>
>     # Or conversion to SI the long route:
>     pc = np.arange(5., dtype="float64").view(u.Float64UnitDType("pc"))
>     pc.astype(pc.dtype.si())
>     # array([0.0 m, 3.085677580962325e+16 m, 6.17135516192465e+16 m,
>     #        9.257032742886974e+16 m, 1.23427103238493e+17 m],
>     #       dtype='Float64UnitDType(m)')
>
> Yes, the code has some horrible hacks around creating the DType, but
> the basic mechanism i.e.
"functions you need to implement" are not > expected to change lot. > > Right now, it forces you to use and implement the scalar `u.Quantity` > and the code sample uses it. But you can also do: > > np.arange(3.).view(u.Float64UnitDType("m")) > > I do have plans to "not have a scalar" so the 0-D result would still be > an array. But that option doesn't exist yet (and right now the scalar > is used for printing). > > > (There is also a `string_equal` "ufunc-like" that works on "S" dtypes.) > > Cheers, > > Sebastian > > > > PS: I need to figure out some details about how to create DTypes and > DType instances with regards to our stable ABI. The current "solution" > is some weird subclassing hoops which are probably not good. > > That is painful unfortunately and any ideas would be great :). > Unfortunately, it requires a grasp around the C-API and metaclassing... > > > > > > > Anyone using the API, should expect bugs, crashes and changes for a > > while. But hopefully will only require small code modifications when > > the API becomes public. > > > > My personal plan for a toy example is currently a "scaled integer". > > E.g. a uint8 where you can set a range `[min_double, max_double]` > > that > > it maps to (which makes the DType "parametric"). > > We discussed some other examples, such as a "modernized" rational > > DType, that could be nice as well, lets see... > > > > Units would be a great experiment, but seem a bit complex to me (I > > don't know units well though). So to keep it baby steps :) I would > > aim > > for doing the above and then we can experiment on Units together! > > > > > > Since it came up: I agree that a Python API would be great to have. > > It > > is something I firmly kept on the back-burner... It should not be > > very > > hard (if rudimentary), but unless it would help experiments a lot, I > > would tend to leave it on the back-burner for now. 
> >
> > Cheers,
> >
> > Sebastian
> >
> >
> > [1] Maybe a `uint8` storage that maps to evenly spaced values on a
> > parametric range `[double_min, double_max]`. That seems like a good
> > trade-off in complexity.
> >
> >
> > > On Tue, Mar 16, 2021 at 4:11 PM Sebastian Berg <sebast...@sipsolutions.net>
> > > wrote:
> > >
> > > > On Tue, 2021-03-16 at 13:17 -0500, Lee Johnston wrote:
> > > > > Is the work on NEP 42 custom DTypes far enough along to experiment
> > > > > with?
> > > >
> > > > TL;DR: It's not quite ready, but if we work together I think we could
> > > > experiment a fair bit. Mainly ufuncs are still limited (though not
> > > > quite completely missing). The main problem is that we need to find a
> > > > way to expose the currently private API.
> > > >
> > > > I would be happy to discuss this also in a call.
> > > >
> > > >
> > > > ** The long story: **
> > > >
> > > > There is one more PR related to casting, for which merge should be
> > > > around the corner. And which would bring a lot of bang to such an
> > > > experiment:
> > > >
> > > > https://github.com/numpy/numpy/pull/18398
> > > >
> > > >
> > > > At that point, the new machinery supports (or is used for):
> > > >
> > > > * Array-coercion: `np.array([your_scalar])` or
> > > >   `np.array([1], dtype=your_dtype)`.
> > > >
> > > > * Casting (practically full support).
> > > >
> > > > * UFuncs do not quite work. But short of writing `np.add(arr1, arr2)`
> > > >   with your DType involved, you can try a whole lot. (see below)
> > > >
> > > > * Promotion `np.result_type` should work very soon, but probably
> > > >   is not very relevant anyway until ufuncs are fully implemented.
> > > >
> > > > That should allow you to do a lot of good experimentation, but due to
> > > > the ufunc limitation, maybe not well on "existing" python code.
> > > >
> > > >
> > > > The long story about limitations is:
> > > >
> > > > We are missing exposure of the new public API. I think I should be
> > > > able to provide a solution for this pretty quickly, but it might
> > > > require working off a NumPy branch. (I will write another email about
> > > > it, hopefully we can find a better solution.)
> > > >
> > > >
> > > > Limitations for UFuncs: UFuncs are the next big project, so to try it
> > > > fully you will need some patience, unfortunately.
> > > >
> > > > But, there is some good news! You can write most of the "ufunc"
> > > > already, you just can't "register" it.
> > > > So what I can already offer you is a "DType-specific UFunc", e.g.:
> > > >
> > > >     unit_dtype_multiply(np.array([1.], dtype=Float64UnitDType("m")),
> > > >                         np.array([2.], dtype=Float64UnitDType("s")))
> > > >
> > > > And get out `np.array([2.], dtype=Float64UnitDType("m s"))`.
> > > >
> > > > But you can't write `np.multiply(arr1, arr2)` or `arr1 * arr2` yet.
> > > > Both registration and "promotion" logic are missing.
> > > >
> > > > I admit promotion may be one of the trickiest things, but trying this a
> > > > bit might help with getting a clearer picture for promotion as well.
> > > >
> > > >
> > > > The main last limitation is that I did not replace or create "fallback"
> > > > solutions and/or replacement for the legacy `dtype->f-><slots>` yet.
> > > > This is not a serious limitation for experimentation, though. It might
> > > > even make sense to keep some of them around and replace them slowly.
> > > >
> > > >
> > > > And of course, all the small issues/limitations that are not fixed
> > > > because nobody tried yet...
> > > >
> > > >
> > > > I hope this doesn't scare you away, or at least not for long :/.
> > > > It could be very useful to start experimentation soon to push things
> > > > forward a bit quicker. And I really want to have at least an
> > > > experimental version in NumPy 1.21.
> > > >
> > > > Cheers,
> > > >
> > > > Sebastian
> > > >
> > > > > Lee
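
As a note on the "DType-specific UFunc" quoted above: the part that makes the
result dtype differ from the inputs is the unit bookkeeping, and that can be
sketched without NumPy at all. The following is a rough stand-in under my own
assumptions, not how `experimental_user_dtypes` implements it. Units are
modelled as plain dicts of base-unit exponents, and `multiply_units`,
`format_unit` and `unit_multiply` are hypothetical helper names.

```python
def multiply_units(a: dict, b: dict) -> dict:
    """Add the exponents of matching base units and drop units that cancel."""
    total = dict(a)
    for name, power in b.items():
        total[name] = total.get(name, 0) + power
    return {name: power for name, power in total.items() if power != 0}


def format_unit(unit: dict) -> str:
    """Render exponents the way the quoted outputs do, e.g. "m s" or "m**2"."""
    parts = []
    for name in sorted(unit):
        power = unit[name]
        parts.append(name if power == 1 else f"{name}**{power}")
    return " ".join(parts)


def unit_multiply(values_a, unit_a, values_b, unit_b):
    """Element-wise product together with the unit of the result."""
    if len(values_a) != len(values_b):
        raise ValueError("inputs must have the same length")
    out = [x * y for x, y in zip(values_a, values_b)]
    return out, multiply_units(unit_a, unit_b)


# Mirrors the two multiplications quoted in the thread: m * s and m * m.
values, unit = unit_multiply([1.0], {"m": 1}, [2.0], {"s": 1})
print(values, format_unit(unit))
values, unit = unit_multiply([5.0], {"m": 1}, [5.0], {"m": 1})
print(values, format_unit(unit))
```

In the real DType the second return value would correspond to the output
dtype instance that the ufunc has to determine before it runs the inner loop
over the values.
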
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion