On Fri, 2022-11-11 at 14:55 +0100, Oscar Gustafsson wrote:
> Thanks! That does indeed look like a promising approach! And for sure it
> would be better to avoid having to reimplement the whole array-part and
> only focus on the data types. (If successful, my idea of a project would
> basically solve all the custom numerical types discussed, bfloat16, int2,
> int4 etc.)

OK, more below.  But unfortunately `int2` and `int4` *are* problematic,
because the NumPy array uses a byte-sized strided layout, so you would
have to store them in a full byte, which is probably not what you want.

I keep thinking of adding a provision for it in the DTypes so that
someone could use part of the NumPy machinery to make an array that can
have non-byte sized strides, but the NumPy array itself is ABI
incompatible with storing these packed :(.

(I.e. we could plug that "hole" to allow making an int4 DType in NumPy,
but it would still have to take 1-byte storage space when put into a
NumPy array, so I am not sure there is much of a point.)
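
To illustrate the byte-sized stride point, here is a minimal sketch of
what packing has to look like today.  It is not a NumPy feature:
`pack_int4` and `unpack_int4` are hypothetical helper names, and the
low-nibble-first layout is just an assumption for the example.  The
packed values live in a plain uint8 array, so NumPy itself only ever
sees whole bytes.

```python
import numpy as np

# Strides are counted in whole bytes, even for the smallest integer dtype.
a = np.arange(8, dtype=np.int8)
assert a.strides == (1,)


def pack_int4(values):
    """Pack signed 4-bit values (-8..7) two per byte (hypothetical helper)."""
    v = np.asarray(values, dtype=np.int8)
    if v.size % 2:
        raise ValueError("need an even number of values")
    if v.size and (v.min() < -8 or v.max() > 7):
        raise ValueError("values must be in the range -8..7")
    # Keep only the low 4 bits (two's complement nibble) of each value.
    nibbles = (v & 0x0F).astype(np.uint8)
    # Even-indexed values go in the low nibble, odd-indexed in the high one.
    return nibbles[0::2] | (nibbles[1::2] << 4)


def unpack_int4(packed):
    """Inverse of pack_int4: returns an int8 array, one value per byte."""
    p = np.asarray(packed, dtype=np.uint8)
    out = np.empty(p.size * 2, dtype=np.int8)
    out[0::2] = p & 0x0F
    out[1::2] = p >> 4
    # Sign-extend: nibbles 8..15 represent -8..-1.
    out[out > 7] -= 16
    return out


values = [-8, -1, 0, 7, 3, -4]
packed = pack_int4(values)      # 3 bytes for 6 values
restored = unpack_int4(packed)  # 6 bytes again, one per value
```

Any arithmetic still has to go through the unpacked int8 form, which is
the 1-byte-per-element storage mentioned above.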

> 
> I understand that the following is probably a hard question to answer, but
> is it expected that there will be work done on this in the "near" future
> to fill any holes and possibly become more stable? For context, the current
> plan on my side is to propose this as a student project for the spring, so
> primarily asking for planning and describing the project a bit better.


Well, it depends on what you need.  With the exception above, I doubt
the "holes" will matter much in practice unless you are targeting a
polished release rather than experimentation.
But of course it may be that you run into something that is important
for you, but doesn't yet quite work.

I will note that just dealing with the Python/NumPy C-API can be a fairly
steep learning curve, so you need someone comfortable diving in and
should budget a good amount of time for that part.
And yes, this is pretty new, so there may be stumbling stones (which I
am happy to discuss in NumPy issues or directly).

- Sebastian


> 
> BR Oscar
> 
> On Thu, 10 Nov 2022 at 15:13, Sebastian Berg <
> sebast...@sipsolutions.net> wrote:
> 
> > On Thu, 2022-11-10 at 14:55 +0100, Oscar Gustafsson wrote:
> > > On Thu, 10 Nov 2022 at 13:10, Sebastian Berg <
> > > sebast...@sipsolutions.net> wrote:
> > > 
> > > > On Thu, 2022-11-10 at 11:08 +0100, Oscar Gustafsson wrote:
> > > > > > 
> > > > > > > I'm not an expert, but I never encountered rounding
> > > > > > > floating point numbers in bases different from 2 and 10.
> > > > > > 
> > > > > 
> > > > > > I agree that this is probably not very common. More a
> > > > > > possibility if one would supply a base argument to around.
> > > > > > 
> > > > > > However, it is worth noting that Matlab has the quant
> > > > > > function,
> > > > > > https://www.mathworks.com/help/deeplearning/ref/quant.html
> > > > > > which basically supports arbitrary bases (as a special case
> > > > > > of an even more general approach). So there may be other use
> > > > > > cases (although the example basically just implements
> > > > > > around(x, 1)).
> > > > 
> > > > 
> > > > To be honest, hearing hardware design and data compression does
> > > > make me
> > > > lean towards it not being mainstream enough that inclusion in
> > > > NumPy
> > > > really makes sense.  But happy to hear opposing opinions.
> > > > 
> > > 
> > > > Here I can easily argue that "all" computations are limited by
> > > > finite word length and as soon as you want to see the effect of
> > > > any type of format not supported out of the box, it will be
> > > > beneficial. (Strictly, it makes more sense to quantize to a given
> > > > number of bits than a given number of decimal digits, as we
> > > > cannot represent most of those exactly.)  But I may not do that.
> > > 
> > > 
> > > > > It would be nice to have more of a culture around ufuncs that
> > > > > do not live in NumPy.  (I suppose at some point it was more
> > > > > difficult to write C-extensions, but that was many years ago.)
> > > > 
> > > 
> > > I do agree with this though. And this got me realizing that maybe
> > > what I
> > > actually would like to do is to create an array-library with
> > > fully
> > > customizable (numeric) data types instead. That is, sort of, the
> > > proper way
> > > to do it, although the proposed approach is indeed simpler and in
> > > most
> > > cases will work well enough.
> > > 
> > > (Am I right in believing that it is not that easy to piggy-back
> > > custom data
> > > types onto NumPy arrays? Something different from using object as
> > > dtype or
> > > the "struct-like" custom approach using the existing scalar
> > > types.)
> > 
> > NumPy is pretty much fully customizable (beyond just numeric data
> > types).
> > Admittedly, to not have weird edge cases and have more power you
> > have
> > to use the new API (NEP 41-43 [1]) and that is "experimental" and
> > may
> > have some holes.
> > "Experimental" doesn't mean it is expected to change significantly,
> > just that you can't ship your stuff broadly really.
> > 
> > The holes may matter for some complicated dtypes (custom memory
> > allocation, parametric...). But at this point many should be rather
> > fixable, so before you write your own, give NumPy a chance?
> > 
> > - Sebastian
> > 
> > 
> > [1] https://numpy.org/neps/nep-0041-improved-dtype-support.html
> > 
> > > 
> > > BR Oscar Gustafsson
> > > _______________________________________________
> > > NumPy-Discussion mailing list -- numpy-discussion@python.org
> > > To unsubscribe send an email to numpy-discussion-le...@python.org
> > > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> > > Member address: sebast...@sipsolutions.net
> > 
> > 
> > 


