On 15/04/2024 13:00, Richard Biener wrote:
On Mon, Apr 15, 2024 at 12:04 PM Tobias Burnus <tbur...@baylibre.com> wrote:
I experimented with some variants to make clearer that each of RDNA2 and
RDNA3 applies to two card types, but in the end I settled on the
fewest-word version.
Comments, remarks, suggestions? (To this change or in general?)
Current version: https://gcc.gnu.org/gcc-14/changes.html#amdgcn
Compiler flags, listing the gfx* cards:
https://gcc.gnu.org/onlinedocs/gcc/AMD-GCN-Options.html
Tobias
PS: On the compiler side, I am looking forward to a .def file which
reduces the number of files to change when adding a new gfx* card, given
that we have doubled the number of entries. [Well, 1 missing but I know
of one WIP addition.]
I do wonder whether hot-patching the ELF header from the libgomp plugin
with the actual micro-subarch would be possible to make the driver happy.
We do query the device ISA when initializing the device so we should
be able to massage the ELF header of the object in GOMP_OFFLOAD_load_image
at least within some constraints (ideally we'd mark the ELF object as to
be matched with a device in some group).
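As a rough illustration of the header massaging described above, here is a
minimal sketch in Python (the plugin itself would do this in C). It only
rewrites the machine ID bits of e_flags in an in-memory ELF64 image. The
helper names patch_mach and read_mach are hypothetical, and the offset, mask
and machine ID values are assumptions based on my reading of the ELF64 layout
and the LLVM AMDGPU documentation, so they should be checked before use. It
does nothing about the kernel metadata.

```python
import struct

# Assumptions (not from this thread): ELF64 little-endian layout, where
# e_flags is a 32-bit field at byte offset 48, and the low 8 bits of
# e_flags carry the AMDGPU machine ID (EF_AMDGPU_MACH mask).
E_FLAGS_OFFSET = 48
EF_AMDGPU_MACH_MASK = 0x0FF

# Machine IDs as I understand them from the LLVM AMDGPU docs (assumed).
EF_AMDGPU_MACH_GFX908 = 0x030
EF_AMDGPU_MACH_GFX90A = 0x03F


def read_mach(image: bytes) -> int:
    """Return the machine ID stored in the e_flags field of an ELF64 image."""
    (flags,) = struct.unpack_from("<I", image, E_FLAGS_OFFSET)
    return flags & EF_AMDGPU_MACH_MASK


def patch_mach(image: bytes, new_mach: int) -> bytes:
    """Return a copy of the image with only the machine ID bits replaced."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if image[4] != 2 or image[5] != 1:
        raise ValueError("only ELF64 little-endian is handled in this sketch")
    (flags,) = struct.unpack_from("<I", image, E_FLAGS_OFFSET)
    # Clear the old machine ID, keep every other flag bit, set the new ID.
    flags = (flags & ~EF_AMDGPU_MACH_MASK & 0xFFFFFFFF) | (
        new_mach & EF_AMDGPU_MACH_MASK
    )
    patched = bytearray(image)
    struct.pack_into("<I", patched, E_FLAGS_OFFSET, flags)
    return bytes(patched)
```

Something like patch_mach(image, EF_AMDGPU_MACH_GFX90A) would be applied to
the object before handing it to the runtime, with the target ID taken from
the device ISA query.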
This might work in some limited cases, especially if you limit the
codegen to some subset of the ISA, but in general the metadata on the
kernel entry-points is device-specific. For example, the gfx908 and
gfx90a have different granularity on the VGPR count settings. It would
probably be possible to generate some matching sets.
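To make the VGPR point concrete, here is a small sketch of how the same
register usage encodes differently on the two devices. The granule sizes
(4 for gfx908, 8 for gfx90a) and the "ceil(used / granule) - 1" encoding are
my understanding of the granulated VGPR count field in the kernel descriptor
as documented by LLVM; treat them as assumptions. The function name is
hypothetical.

```python
# Assumed VGPR allocation granule per device (from my reading of the
# LLVM AMDGPU kernel descriptor docs, not from this thread).
VGPR_GRANULE = {"gfx908": 4, "gfx90a": 8}


def granulated_vgpr_count(vgprs_used: int, device: str) -> int:
    """Encode a VGPR usage count the way the kernel descriptor expects it."""
    granule = VGPR_GRANULE[device]
    blocks = -(-vgprs_used // granule)  # integer ceiling division
    return max(0, blocks - 1)


# The same kernel using 24 VGPRs needs a different metadata value per device.
for device in ("gfx908", "gfx90a"):
    print(device, granulated_vgpr_count(24, device))
```

So a header patch alone would leave the kernel metadata wrong for the other
device unless the encoded values happen to be valid for both.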
However, there's probably no need to do that ourselves because the LLVM
tools now have new generic ELF flags "gfx9-generic", "gfx10-1-generic",
"gfx10-3-generic", and "gfx11-generic" which supposedly do what you
want. I've not experimented with them. I don't know if libraries can
have the generic variant and still link with the specific variant (the
only libraries with kernel entry-points are the libgcc init_array and
fini_array). If not it becomes yet another multilib.
I'm very sure there's no single binary that will run anywhere for real
use cases.
Andrew