Yup yup, good point. My array sizes in this case are 3e8, so 32-bit ints would be needed for the offsets. So it is not a solution for this case.
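To put rough numbers on the offsets idea, here is a back-of-the-envelope sketch. It assumes the value distribution quoted further down in this thread (90% fit in one byte, 1% need eight bytes), and it assumes an average of 4 bytes for the 9% "in between", which is only an illustrative guess. The variable names are made up for this example.

```python
# Rough storage estimate for an "offsets + packed data" layout.
# All widths below are assumptions for illustration, not measurements.
n = 300_000_000  # 3e8 elements

# (share of elements in percent, assumed bytes per value)
# The 4-byte width for the "in between" 9% is a guess.
distribution = [(90, 1), (9, 4), (1, 8)]

# Packed variable-width payload, in bytes.
payload_bytes = sum(n * pct // 100 * width for pct, width in distribution)

# One 32-bit offset per element (3e8 does not fit in 16 bits).
offsets_bytes = n * 4

# Byte offsets into the payload must themselves fit in 32 bits.
offsets_fit_uint32 = payload_bytes < 2**32

packed_total_bytes = payload_bytes + offsets_bytes
plain_uint64_bytes = n * 8  # fixed-width array able to hold every value

print(f"payload:       {payload_bytes / 1e9:.3f} GB")
print(f"offsets:       {offsets_bytes / 1e9:.3f} GB")
print(f"packed total:  {packed_total_bytes / 1e9:.3f} GB")
print(f"plain uint64:  {plain_uint64_bytes / 1e9:.3f} GB")
print(f"offsets fit in uint32: {offsets_fit_uint32}")
```

Under these assumptions the offsets alone take about 1.2 GB, roughly three times the packed payload of about 0.4 GB, so the per-element offset dominates whenever most values fit in one byte.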
Nevertheless, such a concept would still be worthwhile for cases where integers are, say, at most 256 bits (or unlimited), even if memory addresses or offsets are 64-bit. This would both:
a) save memory if many of the values in the array are much smaller than 256 bits
b) provide a standard for values of dynamically unlimited size

For now, a temporary solution for me could be a type which stays at its minimum/maximum when it goes below/above its bounds.

Integer types don’t work here at all: np.uint8(255) + 2 = 1 (it wraps around). Totally unacceptable.

Floats are a bit better: np.float16(65500) + 100 = np.float16(inf). At least it didn’t reset and it went the right way (just a bit too much).

> On 13 Mar 2024, at 18:26, Matti Picus <matti.pi...@gmail.com> wrote:
>
> I am not sure what kind of a scheme would support various-sized native ints.
> Any scheme that puts pointers in the array is going to be worse: the pointers
> will be 64-bit. You could store offsets to data, but then you would need to
> store both the offsets and the contiguous data, nearly doubling your storage.
> What shape are your arrays, that would be the minimum size of the offsets?
>
> Matti
>
>
> On 13/3/24 18:15, Dom Grigonis wrote:
>> By the way, I think I am referring to integer arrays. (Or integer part of
>> floats.)
>>
>> I don’t think what I am saying sensibly applies to floats as they are.
>>
>> Although, new float type could base its integer part on such concept.
>>
>>
>> Where I am coming from is that I started to hit maximum bounds on integer
>> arrays, where most of values are very small and some become very large. And
>> I am hitting memory limits. And I don’t have many zeros, so sparse arrays
>> aren’t an option.
>>
>> Approximately:
>> 90% of my arrays could fit into `np.uint8`
>> 1% requires `np.uint64`
>> the rest 9% are in between.
>>
>> And there is no predictable order where is what, so splitting is not an
>> option either.
>>
>>
>>> On 13 Mar 2024, at 17:53, Nathan <nathan.goldb...@gmail.com> wrote:
>>>
>>> Yes, an array of references still has a fixed size width in the array
>>> buffer. You can think of each entry in the array as a pointer to some other
>>> memory on the heap, which can be a dynamic memory allocation.
>>>
>>> There's no way in NumPy to support variable-sized array elements in the
>>> array buffer, since that assumption is key to how numpy implements strided
>>> ufuncs and broadcasting.
>>>
>>> On Wed, Mar 13, 2024 at 9:34 AM Dom Grigonis <dom.grigo...@gmail.com> wrote:
>>>
>>> Thank you for this.
>>>
>>> I am just starting to think about these things, so I appreciate
>>> your patience.
>>>
>>> But isn’t it still true that all elements of an array are still
>>> of the same size in memory?
>>>
>>> I am thinking along the lines of per-element dynamic memory
>>> management. Such that if I had array [0, 1e10000], the first
>>> element would default to reasonably small size in memory.
>>>
>>>> On 13 Mar 2024, at 16:29, Nathan <nathan.goldb...@gmail.com> wrote:
>>>>
>>>> It is possible to do this using the new DType system.
>>>>
>>>> Sebastian wrote a sketch for a DType backed by the GNU
>>>> multiprecision float library:
>>>> https://github.com/numpy/numpy-user-dtypes/tree/main/mpfdtype
>>>>
>>>> It adds a significant amount of complexity to store data outside
>>>> the array buffer and introduces the possibility of
>>>> use-after-free and dangling reference errors that are impossible
>>>> if the array does not use embedded references, so that’s the
>>>> main reason it hasn’t been done much.
>>>>
>>>> On Wed, Mar 13, 2024 at 8:17 AM Dom Grigonis
>>>> <dom.grigo...@gmail.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Say python’s builtin `int` type. It can be as large as
>>>> memory allows.
>>>>
>>>> np.ndarray on the other hand is optimized for vectorization
>>>> via strides, memory structure and many things that I
>>>> probably don’t know.
>>>> Well the point is that it is convenient
>>>> and efficient to use for many things in comparison to
>>>> python’s built-in list of integers.
>>>>
>>>> So, I am thinking whether something in between exists? (And
>>>> obviously something more clever than np.array(dtype=object))
>>>>
>>>> Probably something similar to `StringDType`, but for
>>>> integers and floats. (It’s just my guess. I don’t know
>>>> anything about `StringDType`, but just guessing it must be
>>>> better than np.array(dtype=object) in combination with
>>>> np.vectorize)
>>>>
>>>> Regards,
>>>> dgpb
>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>>>> To unsubscribe send an email to
>>>> numpy-discussion-le...@python.org
>>>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>>>> Member address: nathan12...@gmail.com