On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:
On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:
On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:

If you have four ubyte variables in a struct and then
an array of them, then you are getting optimal memory usage.

Is this some kind of property? Where can I read more about this?

So you can optimize memory usage by using arrays of things smaller than `int` if these are enough for your purposes, but what about using them instead of `int` for single variables, for example as a loop counter, if the range of such a data type is enough for me? Are there any advantages to doing that?

A couple of other important use-cases came to me. The first one is Unicode, which has three main representations: UTF-8, a stream of bytes in which each character takes between one and four bytes; UTF-16, where a character is one or, rarely, two 16-bit words; and UTF-32, a stream of 32-bit words, one per character (code point). The simplicity of the latter is a huge deal for processing speed, but UTF-32 takes up almost four times as much memory as UTF-8 for western European languages like English or French. The four-to-one ratio means that the processor has to pull in four times the amount of memory, so that’s a slowdown, but on the other hand it is processing the same number of characters whichever way you look at it, and in UTF-8 the CPU has to parse more bytes than characters unless the text is entirely ASCII.
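To put numbers on the size trade-off, here is a minimal sketch in C that counts how many bytes a sequence of code points needs in each encoding. The function names (`utf8_len`, `utf16_len`, `utf32_len`, `encoded_size`) are made up for illustration and are not from any library, and the sketch assumes its input is already valid code points (it does not reject surrogates).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one code point in UTF-8: 1 to 4. */
size_t utf8_len(uint32_t cp) {
    assert(cp <= 0x10FFFF);          /* highest Unicode code point */
    if (cp < 0x80)    return 1;      /* ASCII */
    if (cp < 0x800)   return 2;      /* e.g. accented Latin letters */
    if (cp < 0x10000) return 3;      /* rest of the Basic Multilingual Plane */
    return 4;                        /* supplementary planes */
}

/* Bytes needed in UTF-16: one 16-bit unit, or a surrogate pair above U+FFFF. */
size_t utf16_len(uint32_t cp) {
    assert(cp <= 0x10FFFF);
    return cp < 0x10000 ? 2 : 4;
}

/* UTF-32 always uses one 32-bit unit per code point. */
size_t utf32_len(uint32_t cp) {
    assert(cp <= 0x10FFFF);
    return 4;
}

/* Total encoded size of n code points, using one of the length functions above. */
size_t encoded_size(const uint32_t *cps, size_t n, size_t (*len)(uint32_t)) {
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += len(cps[i]);
    return total;
}
```

For the four code points of "café" (`'c'`, `'a'`, `'f'`, `0xE9`), calling `encoded_size` with `utf8_len`, `utf16_len` and `utf32_len` gives 5, 8 and 16 bytes respectively, which is the "almost four times" ratio for mostly-ASCII text.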

The second use-case is about SIMD. Intel and AMD x86 machines have vector arithmetic units that are 16, 32 or 64 bytes wide depending on how recent the model is. Take for example an Intel Haswell CPU (2013 onwards), which has 32-byte wide units: if you choose narrower data types you can fit more of them in the vector unit, and fitting in integers or floating point numbers of half the width means that you can process twice as many in one instruction. On our Haswell that means four doubles or four 64-bit quad words, or eight 32-bit floats or uint32_ts, and doubling again to sixteen for uint16_t. So here width economy can translate directly into double the speed.
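The lane arithmetic is simple enough to sketch in C. This is only the counting, not real SIMD code: `simd_lanes` and `vector_ops` are hypothetical helper names, and the instruction count is an idealised figure that assumes every element is handled by one full-width vector operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Width of one vector register in bytes for the Haswell example (assumed 32). */
enum { VECTOR_BYTES = 32 };

/* How many elements of elem_bytes each fit in one vector register. */
size_t simd_lanes(size_t vector_bytes, size_t elem_bytes) {
    assert(elem_bytes > 0 && vector_bytes % elem_bytes == 0);
    return vector_bytes / elem_bytes;
}

/* Idealised number of vector operations needed to touch n elements once,
   rounding up so that a partial final vector still counts as one operation. */
size_t vector_ops(size_t n, size_t vector_bytes, size_t elem_bytes) {
    size_t lanes = simd_lanes(vector_bytes, elem_bytes);
    return (n + lanes - 1) / lanes;
}
```

With `VECTOR_BYTES` at 32, `simd_lanes` gives 4 for an 8-byte double, 8 for a 4-byte float or `uint32_t`, and 16 for `uint16_t`. Processing 1024 elements would then take 256, 128 and 64 vector operations respectively, so each halving of the element width halves the operation count.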
