On Thu, Jun 5, 2008 at 4:31 PM, H.J. Lu <[EMAIL PROTECTED]> wrote:
> Hi,
>
> x86-64 psABI defines
>
> typedef struct
> {
>  unsigned int gp_offset;
>  unsigned int fp_offset;
>  void *overflow_arg_area;
>  void *reg_save_area;
> } va_list[1];
>
> for variable argument list. "va_list" is used to access variable argument
> list:
>
> void
> bar (const char *format, va_list ap)
> {
>  if (va_arg (ap, int) != 0)
>    abort ();
> }
>
> void
> foo(char *fmt, ...)
> {
>  va_list ap;
>  va_start (ap, fmt);
>  bar (fmt, ap);
>  va_end (ap);
> }
>
> foo and bar may be compiled with different compilers. We have to keep
> the current layout for va_list so that we can mix va_list codes compiled
> with AVX and non-AVX compilers. We need to extend the variable argument
> handling in the x86-64 psABI to support passing __m256/__m256d/__m256i
> on the variable argument list. We propose 2 ways to extend the register
> save area to add 256bit AVX registers support:
>
> 1. Extend the register save area to put upper 128bit at the end.
>  Pros:
>    Aligned access.
>    Save stack space if 256bit registers are used.
>  Cons:
>    Split access. Requires more split accesses beyond 256bit.
>
> 2. Extend the register save area to put full 256bit YMMs at the end.
> The first DWORD after the register save area has the offset of
> the extended array for YMM registers. The next DWORD has the
> element size of the extended array. Unaligned access will be used.
>  Pros:
>    No split access.
>    Easily extendable beyond 256bit.
>    Limited unaligned access penalty if stack is aligned at 32byte.
>  Cons:
>    May require storing both the lower 128bit and full 256bit register
>    content. We may avoid saving the lower 128bit if correct type
>    is required when accessing variable argument list, similar to int
>    vs. double.
>    Wastes 272 bytes on stack when 256bit registers are used.
>    Unaligned load and store.
>
> We should agree on one approach to ensure compatibility between
> different compilers.
>
> Personally, I prefer #2 for its simplicity. Does anyone else have a
> preference?

If you want to mix AVX and non-AVX code then you need a way to
detect at runtime whether AVX information was saved.  What is that
mechanism in each of the two cases?

If you don't want to mix AVX and non-AVX code then basically you
can declare the ABIs incompatible anyway?

There is also a third option of passing AVX values by reference.
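
A rough sketch of what passing by reference could look like at the
source level, assuming a hypothetical 32-byte struct "vec256" as a
stand-in for __m256d so that it builds without AVX support.  Only a
pointer goes through the va_list, so the register save area and the
va_list layout stay exactly as they are today:

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical 32-byte stand-in for __m256d (an assumption made for
   illustration, not a psABI type).  */
typedef struct { double d[4]; } vec256;

/* Sum lane 0 of COUNT vectors, each passed as a pointer.  */
static double
sum_lane0 (int count, ...)
{
  va_list ap;
  double total = 0.0;
  int i;

  assert (count >= 0);

  va_start (ap, count);
  for (i = 0; i < count; i++)
    {
      /* Only an 8-byte pointer is fetched from the va_list; the
         256bit value itself lives in the caller's frame.  */
      const vec256 *v = va_arg (ap, vec256 *);
      total += v->d[0];
    }
  va_end (ap);
  return total;
}
```

A caller would write sum_lane0 (2, &a, &b) for two vec256 objects a
and b.  The cost is an extra indirection and a caller-side temporary
for each 256bit argument.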

For simplicity I would also prefer 2) - after all we don't need to fill
in the XMM area / the AVX area if the value is unused.
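
To make 2) concrete, here is a minimal sketch of how a reader of the
save area might locate a YMM slot through the two DWORDs.  The helper
name "ymm_slot" is made up, and I am assuming the offset DWORD is
measured from the start of the register save area, which the proposal
does not spell out.  The 176 bytes are the existing area: 6 GP
registers of 8 bytes followed by 8 XMM registers of 16 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum
{
  GP_SAVE_BYTES = 6 * 8,     /* rdi, rsi, rdx, rcx, r8, r9 */
  XMM_SAVE_BYTES = 8 * 16,   /* xmm0 .. xmm7 */
  LEGACY_SAVE_BYTES = GP_SAVE_BYTES + XMM_SAVE_BYTES   /* 176 */
};

/* Return the address of the saved copy of YMM register REGNO, or
   NULL if the extended array cannot hold 256bit elements.
   Assumption: the offset DWORD is relative to REG_SAVE_AREA.  */
static const unsigned char *
ymm_slot (const unsigned char *reg_save_area, unsigned int regno)
{
  uint32_t ext_offset, elem_size;

  assert (reg_save_area != NULL);

  /* First DWORD after the legacy area: offset of the extended array.
     Second DWORD: size of one element of that array.  */
  memcpy (&ext_offset, reg_save_area + LEGACY_SAVE_BYTES,
          sizeof ext_offset);
  memcpy (&elem_size, reg_save_area + LEGACY_SAVE_BYTES + 4,
          sizeof elem_size);

  if (regno >= 8 || elem_size < 32)
    return NULL;

  return reg_save_area + ext_offset + (size_t) regno * elem_size;
}
```

Note that the element size check only helps if whoever set up the
area actually wrote the two DWORDs.  A prologue emitted by a non-AVX
compiler would leave them uninitialized, which is the detection
question I asked above.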

Richard.
