On 23/06/2025 22:39, Tobias Burnus wrote:
This is based more on reading the documentation than on testing,
as only limited MI300 testing has been done so far and seemingly
this code does not usually get touched.
MI300's "9.1.10 Memory Scope and Temporal Control" distinguishes
between scalar memory (9.1.10.1) for which a single control bit exists:
GLC (Globally Coherent) [+ dlc, slc, scc, but not used by MI300].
And, for vector memory (9.1.10.2; flat, global, scratch, buffer),
there is the system cache level SC[1:0] (wave, group, device, system)
and also NT (non-temporal).
This patch moves back to 'glc' for scalar memory access.
OK for mainline?
Tobias
PS: Some more small fixes are in the pipeline, and there are some
known MI300 issues, not all of them fully understood. Likewise in the
(to-do) pipeline is more in-depth testing.
You still seem to have the unrelated preload bits in this patch, but
other than that, this looks fine.
In principle, we could use %Gn everywhere and use the address space from
the MEM to determine which cache to use, but that's probably overkill
until we need it.
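
Roughly, the idea would look like the sketch below. It is only an
illustration, not the backend's actual code: mem_kind and
cache_modifier are made-up names, and the " sc0 sc1" spelling for
system-scope vector accesses is my assumption from the SC[1:0]
description above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the address space a MEM carries;
   these are not the backend's real identifiers.  */
enum mem_kind
{
  MEM_SCALAR,   /* scalar memory (9.1.10.1) */
  MEM_VECTOR    /* vector memory: flat, global, scratch, buffer (9.1.10.2) */
};

/* Return the cache-control suffix for a memory access.
   COHERENT is true when the access must bypass non-coherent caching.
   Assumption: scalar accesses take the single GLC bit, while vector
   accesses request system scope by setting both SC bits.  */
const char *
cache_modifier (enum mem_kind kind, bool coherent)
{
  if (!coherent)
    return "";

  switch (kind)
    {
    case MEM_SCALAR:
      return " glc";
    case MEM_VECTOR:
      return " sc0 sc1";
    }
  return "";
}
```

A single operand modifier could then call something like this instead
of each pattern choosing between a scalar and a vector modifier by hand.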
Andrew