Hi,

> -----Original Message-----
> From: Hongtao Liu [mailto:crazy...@gmail.com]
> Sent: Friday, May 29, 2020 11:24 AM
> To: H.J. Lu <hjl.to...@gmail.com>
> Cc: Yangfei (Felix) <felix.y...@huawei.com>; gcc-patches@gcc.gnu.org;
> Uros Bizjak <ubiz...@gmail.com>; Jakub Jelinek <ja...@redhat.com>;
> Richard Sandiford <richard.sandif...@arm.com>
> Subject: Re: [PATCH PR95254] aarch64: gcc generate inefficient code with
> fixed sve vector length
>
Snip...

> > >
> > > This is due to define_subst magic.  The generators automatically
> > > create a vec_merge form of the instruction based on the information
> > > in the <mode_*> attributes.
> > >
> > > AFAICT the rtl above is for the line-125 instruction, which looks ok.
> > > The problem is the line-126 instruction, since vcvtps2ph doesn't
> > > AIUI allow zero masking.
> > >
>
> zero masking is not allowed for mem_operand here, but available for
> register_operand.
> there's something wrong in the pattern, we need to fix it.
> (define_insn "<mask_codefor>avx512f_vcvtps2ph512<mask_name>"

Thanks for confirming that :-)

>
> > > The "mask" define_subst allows both zeroing and merging, so I guess
> > > this means that the pattern should either be using a different
> > > define_subst, or should be enforcing merging in some other way.
> > > Please could one of the x86 devs take a look?
> > >
>
> > Hongtao, can you take a look?
> >
> > Thanks.
> >
> >
> > --
> > H.J.
>
> BTW, i failed to build gcc when apply pr95254-v4.txt.
>
> gcc configure:
>
> Using built-in specs.
> COLLECT_GCC=./gcc/xgcc
> Target: x86_64-pc-linux-gnu
> Configured with: ../../gcc/gnu-toolchain/gcc/configure
> --enable-languages=c,c++,fortran --disable-bootstrap
> Thread model: posix
> Supported LTO compression algorithms: zlib
> gcc version 11.0.0 20200528 (experimental) (GCC)
>
> host on x86_64 rel8.

Yes, I tried your configure and reproduced the error.  Thanks for pointing this out.
The patch can pass bootstrap on x86_64 with the following configure options.
Surprised to see that it failed to build with your configure.

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/yangfei/gcc-hacking/install-gcc/libexec/gcc/x86_64-pc-linux-gnu/11/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-git/configure --prefix=/home/yangfei/gcc-hacking/install-gcc --enable-languages=c,c++,objc,obj-c++,fortran,lto --enable-shared --enable-threads=posix --enable-checking=yes --disable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --enable-plugin --enable-initfini-array --without-isl --disable-libmpx --enable-gnu-indirect-function
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.0 20200526 (experimental) (GCC)

Felix