Re: byte and short data types use cases

2023-06-10 Thread Cecil Ward via Digitalmars-d-learn

On Sunday, 11 June 2023 at 00:05:52 UTC, H. S. Teoh wrote:
On Sat, Jun 10, 2023 at 09:58:12PM +, Cecil Ward via 
Digitalmars-d-learn wrote:

On Friday, 9 June 2023 at 15:07:54 UTC,



[...]

On contemporary machines, the CPU is so fast that memory access 
is a much bigger bottleneck than processing speed. So unless an 
operation is being run hundreds of thousands of times, you're 
not likely to notice the difference. OTOH, accessing memory is 
slow (that's why the memory cache hierarchy exists). So utf8 is 
actually advantageous here: it fits in a smaller space, so it's 
faster to fetch from memory, and more of it can fit in the CPU 
cache, so fewer DRAM roundtrips are needed, which is faster. 
Yes, you need extra processing because of the variable-width 
encoding, but it happens mostly inside the CPU, which is fast 
enough that it generally outstrips the memory roundtrip 
overhead. So unless you're doing something *really* complex 
with the utf8 data, it's an overall win in terms of 
performance. The CPU gets to do what it's good at -- running 
complex code -- and the memory cache gets to do what it's good 
at: minimizing the number of slow DRAM roundtrips.




I completely agree with H. S. Teoh. That is exactly what I was 
going to say. The point is that considerations like this have to 
be thought through carefully and width of types really does 
matter in the cases brought up.


But outside these cases, as I said earlier, stick to uint, size_t 
and ulong, or uint32_t and uint64_t if exact size is vital, but 
do check out the other std.stdint types too, as very 
occasionally they are needed.




Re: byte and short data types use cases

2023-06-10 Thread H. S. Teoh via Digitalmars-d-learn
On Sat, Jun 10, 2023 at 09:58:12PM +, Cecil Ward via Digitalmars-d-learn 
wrote:
> On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:
[...]
> > So you can optimize memory usage by using arrays of things smaller
> > than `int` if these are enough for your purposes, but what about
> > using these instead of single variables, for example as an iterator
> > in a loop, if range of such a data type is enough for me? Is there
> > any advantages on doing that?
> 
> A couple of other important use-cases came to me. The first one is
> Unicode, which has three main representations: UTF-8, a stream of
> bytes in which each character can take several bytes; UTF-16, where a
> character is one or, rarely, two 16-bit words; and UTF-32, a stream
> of 32-bit words, one per character (code point). The simplicity of
> the latter is a huge deal for processing speed, but UTF-32 takes up
> almost four times as much memory as UTF-8 for western European
> languages like English or French. The four-to-one ratio means that
> the processor has to pull in four times the amount of memory, so
> that’s a slowdown. On the other hand it is processing the same number
> of characters whichever way you look at it, and in UTF-8 the CPU has
> to parse more bytes than characters unless the text is entirely ASCII.
[...]

On contemporary machines, the CPU is so fast that memory access is a
much bigger bottleneck than processing speed. So unless an operation is
being run hundreds of thousands of times, you're not likely to notice
the difference. OTOH, accessing memory is slow (that's why the memory
cache hierarchy exists). So utf8 is actually advantageous here: it fits
in a smaller space, so it's faster to fetch from memory, and more of it
can fit in the CPU cache, so fewer DRAM roundtrips are needed, which is
faster. Yes, you need extra processing because of the variable-width
encoding, but it happens mostly inside the CPU, which is fast enough
that it generally outstrips the memory roundtrip overhead. So unless
you're doing something *really* complex with the utf8 data, it's an
overall win in terms of performance. The CPU gets to do what it's good
at -- running complex code -- and the memory cache gets to do what it's
good at: minimizing the number of slow DRAM roundtrips.


T

-- 
It said to install Windows 2000 or better, so I installed Linux instead.


Re: byte and short data types use cases

2023-06-10 Thread Cecil Ward via Digitalmars-d-learn

On Saturday, 10 June 2023 at 21:58:12 UTC, Cecil Ward wrote:

On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:

On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:

[...]


Is this some kind of property? Where can I read more about 
this?


My last example is comms. Protocol headers need economical, 
narrow data types for efficiency: it’s all about packing as much 
user data as possible into each packet, and fatter, longer headers 
reduce the amount of user data because the total packet size has 
a hard limit. A pair of headers totalling 40 bytes in IPv4+TCP 
takes up nearly 3% of the total length allowed (40 of 1500 bytes, 
assuming a typical Ethernet MTU), so that’s a ~3% speed loss, as 
the headers are just dead weight. So here narrow types help comms 
speed.
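
To make the arithmetic concrete, here is a minimal sketch. The struct name `Ipv4Header`, its field names and the 1500-byte figure (a typical Ethernet MTU) are assumptions for illustration and are not taken from the post; byte order and header options are ignored.

```d
import std.stdio;

// Hypothetical layout of a minimal IPv4 header without options.
// Multi-byte fields are big-endian on the wire, which this sketch ignores.
struct Ipv4Header
{
    ubyte  versionAndIhl;      // 4-bit version plus 4-bit header length
    ubyte  typeOfService;
    ushort totalLength;
    ushort identification;
    ushort flagsAndFragment;
    ubyte  timeToLive;
    ubyte  protocol;
    ushort headerChecksum;
    uint   sourceAddress;
    uint   destinationAddress;
}

// Narrow fields are what keep the header at 20 bytes.
static assert(Ipv4Header.sizeof == 20);

enum size_t tcpHeaderBytes = 20;   // minimal TCP header, no options
enum size_t assumedMtu = 1500;     // assumption: typical Ethernet MTU

// Share of each full-size packet spent on the two headers, in percent.
double headerOverheadPercent()
{
    return 100.0 * (Ipv4Header.sizeof + tcpHeaderBytes) / assumedMtu;
}

void main()
{
    writefln("header overhead: %.2f%%", headerOverheadPercent());
}
```

If every field above were widened to a 32-bit type, the same header would take 40 bytes instead of 20, doubling the dead weight per packet.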


Re: byte and short data types use cases

2023-06-10 Thread Cecil Ward via Digitalmars-d-learn

On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:

On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:

On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:

If you have four ubyte variables in a struct and then
an array of them, then you are getting optimal memory usage.


Is this some kind of property? Where can I read more about this?

So you can optimize memory usage by using arrays of things 
smaller than `int` if these are enough for your purposes, but 
what about using these instead of single variables, for example 
as an iterator in a loop, if range of such a data type is 
enough for me? Is there any advantages on doing that?


A couple of other important use-cases came to me. The first one 
is Unicode, which has three main representations: UTF-8, a stream 
of bytes in which each character can take several bytes; UTF-16, 
where a character is one or, rarely, two 16-bit words; and 
UTF-32, a stream of 32-bit words, one per character (code point). 
The simplicity of the latter is a huge deal for processing speed, 
but UTF-32 takes up almost four times as much memory as UTF-8 for 
western European languages like English or French. The 
four-to-one ratio means that the processor has to pull in four 
times the amount of memory, so that’s a slowdown. On the other 
hand it is processing the same number of characters whichever way 
you look at it, and in UTF-8 the CPU has to parse more bytes than 
characters unless the text is entirely ASCII.
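
The size ratios are easy to check with D's three string types. The following is a minimal sketch; the helper names `utf8Bytes`, `utf16Bytes` and `utf32Bytes` are made up for illustration.

```d
import std.conv : to;
import std.stdio;

// Bytes needed to store the same text in each encoding.
size_t utf8Bytes(string s)  { return s.length * char.sizeof; }
size_t utf16Bytes(string s) { return s.to!wstring.length * wchar.sizeof; }
size_t utf32Bytes(string s) { return s.to!dstring.length * dchar.sizeof; }

void main()
{
    // "caf\u00E9" is "cafe" with an acute accent on the final letter.
    foreach (text; ["hello", "caf\u00E9", "日本語"])
        writefln("%s: utf-8 %s, utf-16 %s, utf-32 %s bytes",
                 text, utf8Bytes(text), utf16Bytes(text), utf32Bytes(text));
}
```

Pure ASCII text shows the full four-to-one ratio between UTF-32 and UTF-8; the ratio shrinks as more characters need multi-byte UTF-8 sequences.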


The second use-case is about SIMD. Intel and AMD x86 machines 
have vector arithmetic units that are either 16, 32 or 64 bytes 
wide depending on how recent the model is. Taking for example an 
Intel Haswell CPU (2013), which has 32-byte wide units: if you 
choose smaller-width data types you can fit more elements in the 
vector unit, and fitting in integers or floating point numbers of 
half the width means that you can process twice as many in one 
instruction. On our Haswell that means four doubles or four 
64-bit quad words, or eight 32-bit floats or uint32_ts, and 
doubling again gives sixteen uint16_ts. So here width economy 
translates directly into speed: halving the width doubles the 
throughput.
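
The lane counts follow from a single division. The sketch below only computes how many elements fit in a register of a given width; it does not issue any SIMD instructions, and the template name `lanes` is an assumption for illustration.

```d
// Number of elements of type T that fit in one vector register.
// The default of 32 bytes corresponds to the 256-bit units discussed above.
enum size_t lanes(T, size_t vectorBytes = 32) = vectorBytes / T.sizeof;

static assert(lanes!double == 4);
static assert(lanes!ulong  == 4);
static assert(lanes!float  == 8);
static assert(lanes!uint   == 8);
static assert(lanes!ushort == 16);
static assert(lanes!ubyte  == 32);

// The same types in a 16-byte register hold half as many elements.
static assert(lanes!(uint, 16) == 4);

void main() {}
```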


Re: byte and short data types use cases

2023-06-10 Thread Salih Dincer via Digitalmars-d-learn

On Friday, 9 June 2023 at 23:51:07 UTC, Basile B. wrote:
Yes, a classic resource is 
http://www.catb.org/esr/structure-packing/


So you can optimize memory usage by using arrays of things 
smaller than `int` if these are enough for your purposes,


So, is the member ordering correct in a structure like the one 
below, with partial overlap?


```d
struct DATA
{
    union
    {
        ulong bits;
        ubyte[size] cell;
    }
    enum size = 5;
    bool last;
    alias last this;

    size_t length, limit, index = ulong.sizeof;

    bool empty()
    {
        return index / ulong.sizeof >= limit;
    }

    ubyte[] data;
    ubyte front()
    {
        //..
```
This code snippet is from an actual working project of mine. What 
it does is process 40 bits of data.


SDB@79



Re: byte and short data types use cases

2023-06-09 Thread Cecil Ward via Digitalmars-d-learn

On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:

On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:

On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:

If you have four ubyte variables in a struct and then
an array of them, then you are getting optimal memory usage.


Is this some kind of property? Where can I read more about this?

So you can optimize memory usage by using arrays of things 
smaller than `int` if these are enough for your purposes, but 
what about using these instead of single variables, for example 
as an iterator in a loop, if range of such a data type is 
enough for me? Is there any advantages on doing that?


Read up on ‘structs’ and the ‘align’ attribute in the main D 
docs on this website. Using smaller fields in a struct that is 
in memory saves RAM if there is an array of such structs. Even 
where there is only one struct, narrow fields can help: let’s say 
that you are returning a struct by value from some function. If 
the struct is fairly small in total, and the compiler is good 
(ldc or gdc, not dmd - see godbolt.org), then the returned struct 
can sometimes fit into registers rather than being placed in RAM 
when it is returned to the function’s caller. Yesterday I 
returned a struct containing four uint32_t fields from a function 
and it came back to the caller in two 64-bit registers, not in 
RAM. Clearly, using smaller fields where possible might bring the 
whole struct under the size limit for being returned in registers.
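
As a rough sketch of the sizes involved (the struct and function names are hypothetical): whether a struct is actually returned in registers depends on the compiler and the target ABI. On the System V x86-64 ABI, for example, small aggregates of up to 16 bytes are candidates. The generated code is best checked on godbolt.org, as suggested above.

```d
// Hypothetical structs, used only to compare sizes.
struct Wide   { uint  a, b, c, d; }   // 16 bytes: fits two 64-bit registers
struct Narrow { ubyte a, b, c, d; }   // 4 bytes: fits a single register

static assert(Wide.sizeof == 16);
static assert(Narrow.sizeof == 4);

// Both are returned by value; the code generated for the return is
// up to the compiler and the target ABI.
Wide   makeWide()   { return Wide(1, 2, 3, 4); }
Narrow makeNarrow() { return Narrow(1, 2, 3, 4); }

void main()
{
    auto w = makeWide();
    auto n = makeNarrow();
    assert(w.d == 4 && n.d == 4);
}
```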


As for your question about single variables: the answer is very 
definitely no. Rather, the opposite: always use the 
CPU-‘natural’ types, the widths that are most natural to the 
processor in question. 64-bit CPUs will sometimes favour 32-bit 
types, an example being x86-64/AMD64, where handling 32-bit ints 
generates smaller code (saves bytes in the code segment), but 
the speed and number of instructions are the same on such a 
processor whether you’re dealing with 32- or 64-bit types. Always 
use size_t for index variables into arrays or for the size of 
anything in bytes, never int or uint. On a 64-bit machine such as 
x86-64, size_t is 64-bit, not 32. By using int/uint when you 
should have used size_t you could in theory get a very rare bug 
when dealing with e.g. file sizes or vast amounts of (virtual) 
memory, say bigger than 2GB (int limit) or 4GB (uint limit), when 
the 32-bit types overflow. There is also ptrdiff_t, the signed 
counterpart of size_t, which is 64-bit on a 64-bit CPU; it is 
probably not worth bothering with, as its raison d’être was 
historical (the early-80s 80286 segmented architecture, before 
the 32-bit 386 blew it away).
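
A short sketch of both points, with hypothetical function names: `size_t` is the type of an array's `.length`, and a 32-bit counter silently wraps once it passes the 4GB mark.

```d
import std.stdio;

// Sums an array using an index whose type matches `.length`.
ulong sumBytes(const(ubyte)[] data)
{
    ulong total = 0;
    for (size_t i = 0; i < data.length; ++i)
        total += data[i];
    return total;
}

// Shows the wrap-around a 32-bit size suffers one byte past 4GB - 1.
uint wrapAt4GB()
{
    uint size = uint.max;   // 4GB - 1
    size += 1;              // unsigned arithmetic wraps to 0
    return size;
}

void main()
{
    ubyte[] data = [1, 2, 3];
    static assert(is(typeof(data.length) == size_t));

    writeln(sumBytes(data));
    writeln(wrapAt4GB());
    writeln(size_t.sizeof);   // 8 on a 64-bit target, 4 on a 32-bit one
}
```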


Re: byte and short data types use cases

2023-06-09 Thread Ali Çehreli via Digitalmars-d-learn

On 6/9/23 08:07, Murloc wrote:

> Where can I read more about this?

I had written something related:

  http://ddili.org/ders/d.en/memory.html#ix_memory..offsetof

The .offsetof appears at that point. The printObjectLayout() function 
example there attempts to visualize the layout of the members of a struct.
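
For a quick look without the helper from the book, the built-in properties can be printed directly. This is a minimal sketch with a made-up struct `Sample`, not the `printObjectLayout()` function itself, and the offsets in the assertions assume a typical target where `uint` is 4-byte aligned.

```d
import std.stdio;

// Hypothetical struct used only to demonstrate the properties.
struct Sample
{
    ubyte  flag;    // offset 0, followed by 3 bytes of padding
    uint   count;   // offset 4
    ushort id;      // offset 8, followed by 2 bytes of tail padding
}

static assert(Sample.count.offsetof == 4);
static assert(Sample.id.offsetof == 8);
static assert(Sample.sizeof == 12);

void main()
{
    writefln("sizeof %s, alignof %s", Sample.sizeof, Sample.alignof);
    writefln("flag  at offset %s", Sample.flag.offsetof);
    writefln("count at offset %s", Sample.count.offsetof);
    writefln("id    at offset %s", Sample.id.offsetof);
}
```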


Ali



Re: byte and short data types use cases

2023-06-09 Thread Basile B. via Digitalmars-d-learn

On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:

On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:

On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:

If you have four ubyte variables in a struct and then
an array of them, then you are getting optimal memory usage.


Is this some kind of property? Where can I read more about this?


Yes, a classic resource is 
http://www.catb.org/esr/structure-packing/


So you can optimize memory usage by using arrays of things 
smaller than `int` if these are enough for your purposes,


It's not only for arrays, it's also for struct members:

```d
struct S1
{
    ubyte a; // offs 0
    ulong b; // offs 8
    ubyte c; // offs 16
}

struct S2
{
    ubyte a; // offs 0
    ubyte c; // offs 1
    ulong b; // offs 8
}

static assert(S1.sizeof > S2.sizeof); // 24 VS 16
```

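This is because `b` must be aligned on an 8-byte boundary, so 
padding is inserted before it (and after `c` in `S1`), whereas 
`a` and `c` have an alignment of 1 and can sit at any offset.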
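
The difference adds up quickly once such structs are stored in an array. A small sketch, assuming a typical 64-bit target where `ulong` is 8-byte aligned; the helper `footprint` is hypothetical.

```d
import std.stdio : writefln;

struct S1 { ubyte a; ulong b; ubyte c; }   // 24 bytes including padding
struct S2 { ubyte a; ubyte c; ulong b; }   // 16 bytes including padding

static assert(S1.b.offsetof == 8 && S1.c.offsetof == 16);
static assert(S2.c.offsetof == 1 && S2.b.offsetof == 8);

// Bytes occupied by the elements of an array of `count` values of type T.
size_t footprint(T)(size_t count) { return count * T.sizeof; }

void main()
{
    writefln("S1: %s bytes, S2: %s bytes",
             footprint!S1(1_000_000), footprint!S2(1_000_000));
}
```

Reordering the members saves a third of the memory here without changing what the struct can hold.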


but what about using these instead of single variables, for 
example as an iterator in a loop, if range of such a data type 
is enough for me? Is there any advantages on doing that?


Not really; the loop variable takes a marginal part of the stack 
space in the current function. You can just use `auto` and let 
the compiler choose the best type.





Re: byte and short data types use cases

2023-06-09 Thread H. S. Teoh via Digitalmars-d-learn
On Fri, Jun 09, 2023 at 11:24:38AM +, Murloc via Digitalmars-d-learn wrote:
[...]
> Which raised another question: since objects of types smaller than
> `int` are promoted to `int` to use integer arithmetic on them anyway,
> is there any point in using anything of integer type less than `int`
> other than to limit the range of values that can be assigned to a
> variable at compile time?

Not just at compile time, at runtime they will also be fixed to that
width (mapped to a hardware register of that size) and will not be able
to contain a larger value.
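
A minimal sketch of that runtime behaviour (the function name is made up for illustration): a `ubyte` keeps only 8 bits, so values wrap instead of growing.

```d
// Increments a ubyte in place; the variable stays 8 bits wide.
ubyte incrementWrapped(ubyte value)
{
    ++value;          // 255 wraps to 0
    return value;
}

void main()
{
    assert(incrementWrapped(254) == 255);
    assert(incrementWrapped(255) == 0);

    ubyte b = 200;
    // b = b + 100;            // does not compile: the int result needs a cast
    b = cast(ubyte)(b + 100);  // 300 truncated to its low 8 bits
    assert(b == 44);
}
```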


[...]
> People say that there is no advantage for using `byte`/`short` type
> for integer objects over an int for a single variable, however, as
> they say, this is not true for arrays, where you can save some memory
> space by using `byte`/`short` instead of `int`.

That's correct.


> But isn't any further manipulations with these array objects will
> produce results of type `int` anyway? Don't you have to cast these
> objects over and over again after manipulating them to write them back
> into that array or for some other manipulations with these smaller
> types objects?

Yes you will have to cast them back.  Casting often translates to a
no-op or just a single instruction in the machine code; you just write
part of a 32-bit register back to memory instead of the whole thing, and
this automatically truncates the value to the narrow int.

The general advice is, perform computations with int or wider, then
truncate when writing back to storage for storage efficiency. So
generally you wouldn't cast the value to short/byte until the very end
when you're about to store the final result back to the array.  At that
point you'd probably also want to do a range check to catch any
potential overflows.
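
A sketch of that advice using `std.conv.to`, which performs a range check when narrowing and throws `ConvOverflowException` if the value does not fit. The functions `average` and `scale` are hypothetical examples.

```d
import std.conv : to, ConvOverflowException;

// Averages two samples in `int`, then narrows once at the end.
ubyte average(ubyte a, ubyte b)
{
    int wide = (a + b) / 2;   // operands are promoted to int before adding
    return wide.to!ubyte;     // range-checked narrowing; always fits here
}

// Scales a sample; the result may not fit, so the conversion can throw.
ubyte scale(ubyte sample, int factor)
{
    return (sample * factor).to!ubyte;
}

void main()
{
    assert(average(250, 254) == 252);
    assert(scale(100, 2) == 200);

    bool overflowed = false;
    try
    {
        auto unused = scale(100, 3);   // 300 does not fit in a ubyte
    }
    catch (ConvOverflowException)
    {
        overflowed = true;
    }
    assert(overflowed);
}
```

A plain `cast(ubyte)` would instead truncate silently, which is cheaper but hides the overflow.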


> Some people say that these promoting and casting operations in summary
> may have an even slower overall effect than simply using int, so I'm
> kind of confused about the use cases of these data types... (I think
> that my misunderstanding comes from not knowing how things happen at a
> slightly lower level of abstractions, like which operations require
> memory allocation, which do not, etc. Maybe some resource
> recommendations on that?) Thanks!

I highly recommend taking an introductory course to assembly language,
or finding a book / online tutorial on the subject.  Understanding how
the machine actually works under the hood will help answer a lot of
these questions, even if you'll never actually write a single line of
assembly code.

But in a nutshell: integer data types do not allocate, unless you
explicitly ask for it (e.g. `int* p = new int;` -- but you almost never
want to do this). They are held in machine registers or stored on the
runtime stack, and always occupy a fixed size, so almost no memory
management is needed for them. (Which is also why they're preferred when
you don't need anything more fancy, because they're also super-fast.)
Promoting a narrow int takes at most 1 machine instruction, or, in the case of
unsigned values, sometimes zero instructions. Casting back to a narrow
int is often a no-op (the subsequent code just ignores the upper bits).
The performance difference is negligible, unless you're doing expensive
things like range checking after every operation (generally you don't
need to anyway, usually it's sufficient to check range at the end of a
computation, not at every intermediate step -- unless you have reason to
believe that an intermediate step is liable to overflow or wrap around).


T

-- 
People who are more than casually interested in computers should have at
least some idea of what the underlying hardware is like. Otherwise the
programs they write will be pretty weird. -- D. Knuth


Re: byte and short data types use cases

2023-06-09 Thread Murloc via Digitalmars-d-learn

On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:

On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:

If you have four ubyte variables in a struct and then
an array of them, then you are getting optimal memory usage.


Is this some kind of property? Where can I read more about this?

So you can optimize memory usage by using arrays of things 
smaller than `int` if these are enough for your purposes, but 
what about using these instead of single variables, for example 
as an iterator in a loop, if the range of such a data type is 
enough for me? Are there any advantages to doing that?


Re: byte and short data types use cases

2023-06-09 Thread Cecil Ward via Digitalmars-d-learn

On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:
Hi, I was interested why, for example, `byte` and `short` 
literals do not have their own unique suffixes (like `L` for 
`long` or `u` for `unsigned int` literals) and found the 
following explanation:


- "I guess short literal is not supported solely due to the 
fact that anything less than `int` will be "promoted" to `int` 
during evaluation. `int` has the most natural size. This is 
called integer promotion in C++."


Which raised another question: since objects of types smaller 
than `int` are promoted to `int` to use integer arithmetic on 
them anyway, is there any point in using anything of integer 
type less than `int` other than to limit the range of values 
that can be assigned to a variable at compile time? Are these 
data types there because of some historical reasons (maybe 
`byte` and/or `short` were "natural" for some architectures 
before)?


People say that there is no advantage for using `byte`/`short` 
type for integer objects over an int for a single variable, 
however, as they say, this is not true for arrays, where you 
can save some memory space by using `byte`/`short` instead of 
`int`. But isn't any further manipulations with these array 
objects will produce results of type `int` anyway? Don't you 
have to cast these objects over and over again after 
manipulating them to write them back into that array or for 
some other manipulations with these smaller types objects? Or 
is this only useful if you're storing some array of constants 
for reading purposes?


Some people say that these promoting and casting operations in 
summary may have an even slower overall effect than simply 
using int, so I'm kind of confused about the use cases of these 
data types... (I think that my misunderstanding comes from not 
knowing how things happen at a slightly lower level of 
abstractions, like which operations require memory allocation, 
which do not, etc. Maybe some resource recommendations on 
that?) Thanks!


For me there are two use cases for using byte and short, ubyte 
and ushort.


The first is simply to save memory in a large array, or to fit 
neatly into a ‘hole’ in a struct, say next to a bool, which is 
also one byte. If you have four ubyte variables in a struct and 
then an array of such structs, then you are getting optimal 
memory usage. On x86, for example, the conversion from ubyte to 
uint uses instructions that have zero added cost compared to a 
normal uint fetch, and casting to a ubyte generates no code at 
all. So the total cost of casting is zero.
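
At the language level, the conversions in question look like the following sketch (function names are hypothetical). What machine code they turn into is up to the compiler and target, so the zero-cost claim is best verified on godbolt.org.

```d
// Arithmetic on ubyte operands is carried out in int.
int addPromoted(ubyte a, ubyte b)
{
    auto sum = a + b;
    static assert(is(typeof(sum) == int));
    return sum;
}

// Masking lets the compiler see that the value fits, so no cast is needed.
ubyte lowByte(int value)
{
    return value & 0xFF;
}

void main()
{
    assert(addPromoted(200, 100) == 300);

    ubyte narrow = 200;
    uint widened = narrow;                      // implicit widening
    assert(widened == 200);

    assert(lowByte(300) == 44);                 // low 8 bits of 300
    ubyte truncated = cast(ubyte) addPromoted(200, 100);
    assert(truncated == 44);                    // explicit narrowing
}
```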


The second use-case is where you need to interface to external 
specifications that demand uint8_t (ubyte) or uint16_t (ushort), 
where I am using the standard definitions from std.stdint. These 
types are the same as in C. Interfacing to externally defined 
structs, in data structures in RAM or in messages, is one 
example. The second example is where you need to interface to 
machine code that has registers or operands of 8-bit or 16-bit 
types. I like to use the stdint types for the purposes of 
documentation, as it rams home the point that these are truly 
fixed-width types and cannot change. (And I do know that in D, 
unlike C, int, long etc. are of defined fixed widths. C doesn’t 
have those guarantees, which is why stdint.h is needed in C.) As 
well as machine code, we could add other high-level languages, 
where interfaces are defined in the other language and you have 
to hope that the other language’s type widths don’t change.
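
A short sketch of how the std.stdint names relate to the built-in types. The struct `SensorRecord` and its fields are invented for illustration and do not come from any real specification.

```d
import std.stdint;

// In D the fixed-width names are aliases for the built-in types.
static assert(is(uint8_t  == ubyte));
static assert(is(uint16_t == ushort));
static assert(is(uint32_t == uint));
static assert(is(int8_t   == byte));
static assert(is(int16_t  == short));

// Hypothetical externally defined record: the stdint names document
// that each field width is dictated by the outside specification.
struct SensorRecord
{
    uint8_t  kind;
    uint8_t  flags;
    uint16_t reading;
    uint32_t timestamp;
}

static assert(SensorRecord.sizeof == 8);

void main() {}
```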


byte and short data types use cases

2023-06-09 Thread Murloc via Digitalmars-d-learn
Hi, I was interested why, for example, `byte` and `short` 
literals do not have their own unique suffixes (like `L` for 
`long` or `u` for `unsigned int` literals) and found the 
following explanation:


- "I guess short literal is not supported solely due to the fact 
that anything less than `int` will be "promoted" to `int` during 
evaluation. `int` has the most natural size. This is called 
integer promotion in C++."


Which raised another question: since objects of types smaller 
than `int` are promoted to `int` to use integer arithmetic on 
them anyway, is there any point in using anything of integer type 
less than `int` other than to limit the range of values that can 
be assigned to a variable at compile time? Are these data types 
there because of some historical reasons (maybe `byte` and/or 
`short` were "natural" for some architectures before)?


People say that there is no advantage in using a `byte`/`short` 
type over an `int` for a single variable; however, as they say, 
this is not true for arrays, where you can save some memory space 
by using `byte`/`short` instead of `int`. But won't any further 
manipulation of these array elements produce results of type 
`int` anyway? Don't you have to cast these values over and over 
again after manipulating them, to write them back into that array 
or to do other manipulations with these narrower objects? Or is 
this only useful if you're storing some array of constants for 
reading purposes?


Some people say that these promoting and casting operations in 
summary may have an even slower overall effect than simply using 
int, so I'm kind of confused about the use cases of these data 
types... (I think that my misunderstanding comes from not knowing 
how things happen at a slightly lower level of abstractions, like 
which operations require memory allocation, which do not, etc. 
Maybe some resource recommendations on that?) Thanks!