On Saturday, 19 August 2023 at 19:23:38 UTC, Cecil Ward wrote:
I’m trying to write a cross-platform function that gives access
to the CPU’s prefetch instructions, such as x86
prefetcht0/1/2/prefetchnta, and the AArch64 equivalents too. I’ve
found that the GDC and LDC compilers provide builtin magic
functions for this, which are what I need. I am trying to put
together a plain-English detailed spec for the respective builtin
magic functions.
My questions:
Q1) I need to compare the spec for the GCC and LDC builtin
magic functions’ "locality" parameter. Can anyone tell me if
GDC and LDC have kept mutual compatibility here?
I'd have thought GCC and LLVM have mutual compatibility thanks to
a common target API in Intel's `_mm_prefetch()` function (and in
fact, the magic locality numbers match `_MM_HINT_*` constants).
```
/* Values as defined in Clang's xmmintrin.h (GCC uses an enum with the same values). */
#define _MM_HINT_T0 3
#define _MM_HINT_T1 2
#define _MM_HINT_T2 1
#define _MM_HINT_NTA 0
```
Q2) Could someone help me turn the GCC and LDC specs into
English regarding the locality parameter? See (2) and (4)
below.
https://gcc.gnu.org/projects/prefetch.html
Q3) Does the locality parameter determine which _level_ of the
data cache hierarchy data is fetched into? Or is it always
fetched into L1 data cache and the outer ones, and this
parameter affects caches’ _future behaviour_?
It really depends on the CPU, and what features it has.
x86 SSE intrinsics are described in the x86 instruction manual,
along with the meaning of T[012], and NTA.
https://www.felixcloutier.com/x86/prefetchh
Q4) Will these magic builtins work on AArch64?
It'll work on all targets that define a prefetch instruction;
on any other target it'll be a no-op. Similarly, either or both
of the read-write and locality arguments might be ignored.
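
To make the locality argument concrete, here is a minimal C sketch of the GCC/Clang builtin that GDC's builtin maps onto. Only `__builtin_prefetch(addr, rw, locality)` is a real builtin; `sum_with_prefetch`, `PREFETCH_READ`, `PREFETCH_WRITE` and `LOOKAHEAD` are hypothetical names made up for illustration, and the look-ahead distance is an untuned guess. Per the GCC documentation, `rw` is 0 for a read and 1 for a write, and `locality` runs from 0 (no temporal locality) to 3 (high temporal locality, the default); both must be compile-time constants.

```c
#include <stddef.h>

/* Hypothetical names for the rw argument: 0 = prefetch for read, 1 = for write. */
enum { PREFETCH_READ = 0, PREFETCH_WRITE = 1 };

/* How many elements ahead to prefetch: an arbitrary guess, tune per CPU. */
#define LOOKAHEAD 16

/* Sums an array while hinting that data a few iterations ahead will be read soon. */
long sum_with_prefetch(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + LOOKAHEAD < n) {
            /* locality = 3: high temporal locality, keep in all cache levels
               possible. On targets without a prefetch instruction this
               compiles to nothing. */
            __builtin_prefetch(&data[i + LOOKAHEAD], PREFETCH_READ, 3);
        }
        total += data[i];
    }
    return total;
}
```

The prefetch is only a hint, so the function returns the same result whether or not the target honours it; swapping the `3` for `0` would request the non-temporal behaviour instead.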