Re: byte and short data types use cases

Cecil Ward via Digitalmars-d-learn Sat, 10 Jun 2023 15:01:53 -0700

On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:

On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:
On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:
If you have four ubyte variables in a struct and then
an array of them, then you are getting optimal memory usage.
Is this some kind of property? Where can I read more about this?
So you can optimize memory usage by using arrays of things smaller than `int` if these are enough for your purposes, but what about using these instead of single variables, for example as an iterator in a loop, if range of such a data type is enough for me? Is there any advantages on doing that?

A couple of other important use cases came to me. The first one is Unicode, which has three main representations: UTF-8, which is a stream of bytes where each character can take several bytes; UTF-16, where a character is one or, rarely, two 16-bit words; and UTF-32, a stream of 32-bit words, one per character. The simplicity of the latter is a huge deal for speed, but UTF-32 takes up almost four times as much memory as UTF-8 for western European languages like English or French. The four-to-one ratio means that the processor has to pull in four times the amount of memory, so that’s a slowdown, but on the other hand it is processing the same number of characters whichever way you look at it, and in UTF-8 the CPU has to parse more bytes than characters unless the text is entirely ASCII.
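To make the size trade-off concrete, here is a minimal sketch in D, whose `string`, `wstring` and `dstring` types hold UTF-8, UTF-16 and UTF-32 code units respectively. The helper name `storageBytes` and the sample text are only illustrative, not anything from the standard library.

```d
import std.range : walkLength;
import std.stdio : writeln;

// Hypothetical helper: bytes occupied by the code units of any D string type.
size_t storageBytes(S)(S s)
{
    return s.length * typeof(s[0]).sizeof;
}

void main()
{
    // "déjà vu", written with escapes so the source file encoding does not matter.
    string  s8  = "d\u00E9j\u00E0 vu";  // UTF-8:  1-byte code units
    wstring s16 = "d\u00E9j\u00E0 vu"w; // UTF-16: 2-byte code units
    dstring s32 = "d\u00E9j\u00E0 vu"d; // UTF-32: 4-byte code units

    writeln("characters: ", s8.walkLength);    // 7 (walkLength decodes to code points)
    writeln("UTF-8 bytes:  ", storageBytes(s8));  // 9  (the two accented letters take 2 bytes each)
    writeln("UTF-16 bytes: ", storageBytes(s16)); // 14
    writeln("UTF-32 bytes: ", storageBytes(s32)); // 28
}
```

With mostly ASCII text the UTF-32 form approaches four times the UTF-8 size, but its `length` is the character count directly, whereas the UTF-8 form has to be decoded to count characters.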

The second use case is SIMD. Intel and AMD x86 machines have vector arithmetic units that are 16, 32 or 64 bytes wide, depending on how recent the model is. Take for example a post-2013 Intel Haswell CPU, which has 32-byte-wide units: if you choose narrower data types you can fit more of them in the vector unit, and fitting in integers or floating-point numbers of half the width means that you can process twice as many in one instruction. On our Haswell that means four doubles or four quad words, or eight 32-bit floats or eight uint32_ts, and a similar doubling again for uint16_t. So here economy of width translates directly into double the speed.
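The lane arithmetic above can be sketched in D as a compile-time calculation. The template name `lanes` is hypothetical, and the 32-byte default is simply the Haswell width mentioned above. This only counts how many elements fit in one vector register; whether a given loop is actually vectorised depends on the compiler and its flags.

```d
import std.stdio : writeln;

// Hypothetical helper: number of elements of type T that fit in one
// vector register of vectorBytes bytes (32 bytes assumed by default).
enum size_t lanes(T, size_t vectorBytes = 32) = vectorBytes / T.sizeof;

void main()
{
    writeln("double: ", lanes!double); // 4
    writeln("long:   ", lanes!long);   // 4
    writeln("float:  ", lanes!float);  // 8
    writeln("uint:   ", lanes!uint);   // 8
    writeln("ushort: ", lanes!ushort); // 16
    writeln("ubyte:  ", lanes!ubyte);  // 32

    // The same types in a 16-byte unit hold half as many elements.
    writeln("float in 16 bytes: ", lanes!(float, 16)); // 4
}
```

Each halving of the element width doubles the number of elements a single vector instruction can work on.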
