On Saturday, 1 August 2020 at 23:08:38 UTC, Chad Joan wrote:
Though if the compiler is allowed to split a single uint64_t
into two registers, I would expect it to split struct/string
into two registers as well. At least, the manual doesn't seem
to explicitly mention higher-level constructs like structs. It
does suggest a one-to-one relationship between arguments and
registers (up to a point), but GCC seems to have decided
otherwise for certain uint64_t's. (Looking at Table 3...) It
even gives you two registers for a return value: enough for a
string or an array. And if the backend/ABI weren't up for it,
it would be theoretically possible to have the frontend
lower strings (dynamic arrays) and small structs into their
components before function calls and then also insert code on
the other side to cast them back into their original form. I'm
not sure if anyone would want to write it, though. o.O
Right, I think at some point one should fix the backend. C
programs would also benefit from it when passing structs as
arguments. However, in C it is more common to just pass pointers
and they go into registers. I guess this is why I never noticed
before that struct passing is needlessly expensive.
Getting from pointer-length to string might be pretty easy:
string foo = ptr[0 .. len];
Ah cool! I did know about array slicing, but wasn't aware that it
works on pointers, too.
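
For anyone else reading along, here is a minimal sketch of the pointer slicing mentioned above. The helper name `toSlice` is made up for illustration and is not from the pastebin. As far as I understand, slicing a pointer is not bounds-checked, so the caller has to guarantee that the length is valid.

```d
// Hypothetical helper (name is an assumption, not from the thread):
// rebuild a D string from a raw pointer and a length.
// Slicing a pointer is not bounds-checked, so the caller must make sure
// that `len` characters are actually readable at `ptr`.
string toSlice(immutable(char)* ptr, size_t len)
{
    return ptr[0 .. len];
}

void main()
{
    string original = "hello, world";

    // Split the string into its two components...
    immutable(char)* p = original.ptr;
    size_t n = original.length;

    // ...and put them back together on the other side.
    string rebuilt = toSlice(p, n);

    assert(rebuilt == original);          // same contents
    assert(rebuilt.ptr is original.ptr);  // same memory, nothing was copied
    assert(toSlice(p, 5) == "hello");     // a shorter length gives a prefix
}
```

The slice only refers to the existing memory; it stays valid exactly as long as whatever `ptr` points to.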
It ended up being a little more complicated than I thought it
would be. Hope I didn't ruin the fun. ;)
https://pastebin.com/y6e9mxre
Thanks :) I'll have to look into that more closely. This is
the kind of stuff that I hope to make use of in the future on the
embedded CPU. For now I cannot use it, though, because I don't
have Phobos and druntime in my toolchain... just naked
D.
Also, that part where you mentioned a 64-bit integer being
passed as a pair of registers made me start to wonder if unions
could be (ab)used to juke the ABI:
https://pastebin.com/eGfZN0SL
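
The pastebin is not reproduced here, so the following is only a rough sketch of what such a union trick could look like, under my own assumptions: the names `Pair`, `Packed`, `pack`, `unpack` and `sum` are hypothetical, and the payload is two 32-bit fields overlaid on one 64-bit integer. On a 32-bit target a string's pointer and length might be overlaid in the same way, but that depends on the target and is not shown.

```d
// Hypothetical payload: two 32-bit fields, 8 bytes in total.
struct Pair
{
    uint a;
    uint b;
}

// Overlay the struct on a 64-bit integer of the same size.
union Packed
{
    Pair pair;
    ulong bits; // the only thing that crosses the function boundary
}

// Caller side: turn the struct fields into a plain ulong.
ulong pack(uint a, uint b)
{
    Packed p;
    p.pair = Pair(a, b);
    return p.bits;
}

// Callee side: recover the struct from the ulong.
Pair unpack(ulong bits)
{
    Packed p;
    p.bits = bits;
    return p.pair;
}

// The function itself only sees a ulong parameter, which is the type
// that was observed to be passed as a pair of registers.
uint sum(ulong bits)
{
    Pair v = unpack(bits);
    return v.a + v.b;
}

void main()
{
    assert(sum(pack(3, 4)) == 7);

    Pair v = unpack(pack(10, 20));
    assert(v.a == 10 && v.b == 20);
}
```

Whether the argument really ends up in registers depends on the backend and ABI, so the generated assembly is the only reliable check.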
Thanks for the suggestion! I tried, and the union works as well, i.e.
the function args are passed in registers. But I noticed another thing
about all workarounds so far:
Even if calls are inlined and arguments end up on the stack, the
linker puts the code of the wrapper function in my final binary even
if it is never explicitly called. So until I find a way to strip
uncalled functions from the binary (not sure the linker can do
it), the workarounds don't solve the size problem. But they still
make the code run faster.