The implementation fails to handle this test case properly:

typedef double __attribute__((vector_size(32))) v4df;

void use1(double);

__attribute__((noipa)) double use(double)
{
        register double x asm("f24") = 114.514;
        __asm__("" : "+f" (x));
        return x;
}

void test(void)
{
        register v4df x asm("f24") = {1, 2, 3, 4};
        __asm__("" : "+f" (x));
        use(x[1]);
        use1(x[3]);
}

Here use() attempts to save and restore f24, but it uses fst.d/fld.d,
clobbering the high 192 bits of xr24.  Now test() passes a wrong value
of x[3] to use1().

Note that saving and restoring f24 with xvst/xvld in use() won't really
fix the issue because in real life use() can be in another translation
unit (or even a shared library) compiled with -mno-lsx.  So it seems we
need to tell the compiler "a function call may clobber the high bits of
a vector register even if the corresponding floating-point register is
saved".  I'm not sure how to accomplish this...

On Tue, 2023-08-15 at 09:05 +0800, Chenghui Pan wrote:
> This is an update of:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626194.html
> 
> This version of patch set only introduces some small simplications of
> implementation. Because I missed the size limitation of mail size, the
> huge testsuite patches of v2 and v3 are not shown in the mail list.
> So,
> testsuite patches are splited from this patch set again and will be
> submitted 
> independently in the future.
> 
> Binutils-gdb introduced LSX/LASX support since 2.41 release:
> https://lists.gnu.org/archive/html/info-gnu/2023-07/msg00009.html
> 
> Brief history of patch set version:
> v1 -> v2:
> - Reduce usage of "unspec" in RTL template.
> - Append Support of ADDR_REG_REG in LSX and LASX.
> - Constraint docs are appended in gcc/doc/md.texi and ccomment block.
> - Codes related to vecarg are removed.
> - Testsuite of LSX and LASX is added in v2. (Because of the size
> limitation of
>   mail list, these patches are not shown)
> - Adjust the loongarch_expand_vector_init() function to reduce
> instruction 
>   output amount.
> - Some minor implementation changes of RTL templates.
> 
> v2 -> v3:
> - Revert vabsd/xvabsd RTL templates to unspec impl.
> - Resolve warning in gcc/config/loongarch/loongarch.cc when
> bootstrapping 
>   with BOOT_CFLAGS="-O2 -ftree-vectorize -fno-vect-cost-model -mlasx".
> - Remove redundant definitions in lasxintrin.h.
> - Refine commit info.
> 
> Lulu Cheng (6):
>   LoongArch: Add Loongson SX vector directive compilation framework.
>   LoongArch: Add Loongson SX base instruction support.
>   LoongArch: Add Loongson SX directive builtin function support.
>   LoongArch: Add Loongson ASX vector directive compilation framework.
>   LoongArch: Add Loongson ASX base instruction support.
>   LoongArch: Add Loongson ASX directive builtin function support.
> 
>  gcc/config.gcc                                |    2 +-
>  gcc/config/loongarch/constraints.md           |  131 +-
>  .../loongarch/genopts/loongarch-strings       |    4 +
>  gcc/config/loongarch/genopts/loongarch.opt.in |   12 +-
>  gcc/config/loongarch/lasx.md                  | 5122 ++++++++++++++++
>  gcc/config/loongarch/lasxintrin.h             | 5338
> +++++++++++++++++
>  gcc/config/loongarch/loongarch-builtins.cc    | 2686 ++++++++-
>  gcc/config/loongarch/loongarch-c.cc           |   18 +
>  gcc/config/loongarch/loongarch-def.c          |    6 +
>  gcc/config/loongarch/loongarch-def.h          |    9 +-
>  gcc/config/loongarch/loongarch-driver.cc      |   10 +
>  gcc/config/loongarch/loongarch-driver.h       |    2 +
>  gcc/config/loongarch/loongarch-ftypes.def     |  666 +-
>  gcc/config/loongarch/loongarch-modes.def      |   39 +
>  gcc/config/loongarch/loongarch-opts.cc        |   89 +-
>  gcc/config/loongarch/loongarch-opts.h         |    3 +
>  gcc/config/loongarch/loongarch-protos.h       |   35 +
>  gcc/config/loongarch/loongarch-str.h          |    3 +
>  gcc/config/loongarch/loongarch.cc             | 4586 +++++++++++++-
>  gcc/config/loongarch/loongarch.h              |  117 +-
>  gcc/config/loongarch/loongarch.md             |   56 +-
>  gcc/config/loongarch/loongarch.opt            |   12 +-
>  gcc/config/loongarch/lsx.md                   | 4481 ++++++++++++++
>  gcc/config/loongarch/lsxintrin.h              | 5181 ++++++++++++++++
>  gcc/config/loongarch/predicates.md            |  333 +-
>  gcc/doc/md.texi                               |   11 +
>  26 files changed, 28668 insertions(+), 284 deletions(-)
>  create mode 100644 gcc/config/loongarch/lasx.md
>  create mode 100644 gcc/config/loongarch/lasxintrin.h
>  create mode 100644 gcc/config/loongarch/lsx.md
>  create mode 100644 gcc/config/loongarch/lsxintrin.h
> 

-- 
Xi Ruoyao <xry...@xry111.site>
School of Aerospace Science and Technology, Xidian University

Reply via email to