Sure, of course. Actually, I've just had another look and I've realised
I did screw up the ARM stuff - I was trying to maintain the existing
behaviour since I didn't have a test system, but it looks like I messed
it up. I think it's just a matter of deleting the first #ifdef
TCC_ARM_EABI block from gfunc_call though. I'll try and get a look at it
later but I can't promise much since I have zero experience of ARM
development.
Anyway, onto the stuff I fixed:
Calling convention stuff:
* Various x86 and x86-64 calling conventions pack structure return
values into registers when they are small enough. I added gfunc_sret
which determines whether that is the case and prevents an extra
pointer parameter being added to receive the return value. It also
returns the type used to pass the return value, so that tccgen.c can
save it to the stack. Perhaps in retrospect it would have been
better to move return value handling into target specific code
generators, but anyway, it works.
o x86-64: rules are rather complicated, see classify_x86_64_*
functions and the SysV ABI. I had to add a register mode
RC_QRET, analogous to RC_IRET, since a pair of doubles is
returned in XMM0:XMM1. This in turn also means I have added
support for XMM1-5 as general registers, since it was no
more work than XMM1 alone. XMM6-7 aren't caller-saved on Win64
so I didn't make them available for calculation, although I did
add them to the enumeration. Some cases also return 16-byte
structures in RAX:RDX.
o Win32: structures of 8 bytes or less are returned in EAX or EAX:EDX.
o Win64: structures of 8 bytes or less are returned in RAX.
* Similarly, function arguments may be passed in registers rather than
on the stack.
o The SysV x86-64 ABI has rather complicated rules which I've
implemented in classify_x86_64_* functions.
o Win64 rules are somewhat simpler but (as far as I can tell,
because MS's documentation isn't up to much) basically decide
what to do based on whether the argument is larger or smaller
than 8 bytes. Each argument gets 8 bytes of space in registers
or on the stack; if the argument itself is 8 bytes or less it is
passed in that space, otherwise it is passed by reference.
o Win32 rules are the same as Linux-x86 except that small
structures are returned in EAX:EDX.
* x86-64 long double handling: added extra padding so that long
doubles are aligned on 16-byte boundaries. There was already code to
align the stack before the function call, but this actually has to
be done each time a 16-byte aligned argument is encountered as well.
* x86-64 varargs: I modified __builtin_va_arg_types to use the
classify_x86_64_* functions, and added an alignment parameter to
__va_arg so that 16-byte aligned long doubles can be handled.
* Win64 varargs: I added __builtin_va_start on this platform since I
couldn't see a way around it. If the last parameter (the second
argument to va_start) on Win64 is larger than 8 bytes, it will be
passed by reference, and va_start needs to get the address of the
reference, which would require some sort of &(&x) type expression,
which is obviously invalid C. I also redefined va_arg.
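To make the structure cases above concrete, here is a minimal sketch of the kind of code that exercises them. The type and function names are made up for this example and are not taken from the TCC test suite; the register comments describe the SysV x86-64 cases mentioned above, and other targets will differ.

```c
/* Hypothetical types chosen to hit different return paths on SysV x86-64. */
struct pair_d { double a, b; };         /* two doubles: returned in XMM0:XMM1 */
struct pair_i { long long a, b; };      /* two integers: returned in RAX:RDX */
struct big    { long long a, b, c; };   /* 24 bytes: returned via hidden pointer */

struct pair_d make_pair_d(double a, double b)
{
    struct pair_d r = { a, b };
    return r;
}

struct pair_i make_pair_i(long long a, long long b)
{
    struct pair_i r = { a, b };
    return r;
}

struct big make_big(long long a, long long b, long long c)
{
    struct big r = { a, b, c };
    return r;
}

/* A structure larger than 8 bytes passed by value; on Win64 this is the
   case described above as being passed by reference. */
long long sum_big(struct big v)
{
    return v.a + v.b + v.c;
}

/* Returns 1 if every value survives the call and return, 0 otherwise. */
int check_struct_abi(void)
{
    struct pair_d d = make_pair_d(1.5, 2.5);
    struct pair_i i = make_pair_i(3, 4);
    struct big g = make_big(5, 6, 7);

    return d.a == 1.5 && d.b == 2.5
        && i.a == 3 && i.b == 4
        && sum_big(g) == 18;
}
```

Calling check_struct_abi() from a small driver compiled with TCC, and again with the functions compiled by GCC and the driver by TCC, is one way to check that both compilers agree on the convention.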
CMake build system: I added this primarily to make Win64 builds a lot
easier since they then don't need a custom MSYS setup, just 64-bit gcc
and mingw32-make which are available together. It should also work on
other platforms where CMake is available. I had to shift tcclib.h out of
the include/ directory to get some of the tests to work because there
isn't a way in CMake to copy tcclib.h into the test directory, and other
headers in include/ interfere with GCC compilation.
Out-of-tree builds: there were a lot of small issues using the Makefiles
for out of tree builds. They should now be self-updating (modifying the
makefiles updates the out-of-tree copy). I've added $(top_srcdir)/... a
lot to get file references right, and updated include paths where necessary.
Variable length array stuff: VLAs were implemented using alloca() but
the memory wasn't freed until the end of the function. This prevents
VLAs from being used in a loop, for instance. This is pretty
straightforward to fix when goto and labels are not in use: just track
whether the stack pointer has been modified and if so reset it at the
end of each block. Goto handling is tricky because in a normal compiler
we'd just work out what the stack pointer should be at the destination
and set it before jumping. TCC can't do that because it generates code
in a single pass, so what I did instead is that a goto with a VLA in
scope saves the stack pointer and then resets it to its value when the
outermost VLA was created. A label with a VLA in scope then reloads the
appropriate stack pointer. Test cases are in vla_test.c.
This does mean that in certain cases memory allocated by alloca will be
freed when not strictly necessary, i.e. in:
    char data[n];
    char *p = alloca(n);
    goto label;
label:
At "label" p will have been freed. But otherwise VLAs and alloca
shouldn't interfere since the VLA code doesn't do anything unless a VLA
is in scope.
On 29/04/13 22:12, grischka wrote:
> James Lyon wrote:
>> Thanks, however I have to admit that I'm not particularly interested
>> in TCC in the long term.
>> ... and I basically fixed the things that prevented that working.
>
> Maybe you could at some point in time between "too fresh to be sure"
> and "too old to care" give some summary about what now you actually
> did.
>
> Unless that is sufficiently clear already from your commits and
> commit comments in which case please just ignore this message.
>
> Thanks,
> --- grischka
_______________________________________________
Tinycc-devel mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel