testsuite/ChangeLog:
>
> * gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c:
> Reposition operand 3's value into instruction's defined accept range.
^^
Remove these two white spaces.
Should be OK with these ChangeLog style issues fixed.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
/configure ...
which will taint the test suite with -fhardened.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Wed, 2024-03-13 at 10:24 +0800, Xi Ruoyao wrote:
> return TARGET_EXPLICIT_RELOCS
> - ? "pcalau12i\t$r4,%%desc_pc_hi20(%1)\n\
> - \taddi.d\t%2,$r0,%%desc_pc_lo12(%1)\n\
> - \tlu32i.d\t%2,%%desc64_pc_lo20(%1)\n\
> - \tlu52i.d\t%2,%2,%%desc64_pc_hi12(%1)\n
On Wed, 2024-03-13 at 11:06 +0800, mengqinggang wrote:
>
> On 2024/3/13 at 6:15 AM, Xi Ruoyao wrote:
> > On Tue, 2024-03-12 at 17:20 +0800, mengqinggang wrote:
> > > +(define_insn "@got_load_tls_desc"
> > > + [(set (match_operand:P 0 "
On Wed, 2024-03-13 at 06:56 +0800, Xi Ruoyao wrote:
> On Wed, 2024-03-13 at 06:15 +0800, Xi Ruoyao wrote:
> > > +(define_insn "@got_load_tls_desc"
> > > + [(set (match_operand:P 0 "register_operand" "=r")
>
> Hmm, and it looks like we shou
On Wed, 2024-03-13 at 06:15 +0800, Xi Ruoyao wrote:
> > +(define_insn "@got_load_tls_desc"
> > + [(set (match_operand:P 0 "register_operand" "=r")
Hmm, and it looks like we should use (reg:P 4) instead of match_operand
here, because the instruction do
> + ? "pcalau12i\t$r4,%%desc_pc_hi20(%1)\n\
> + \taddi.d\t%2,$r0,%%desc_pc_lo12(%1)\n\
> + \tlu32i.d\t%2,%%desc64_pc_lo20(%1)\n\
> + \tlu52i.d\t%2,%2,%%desc64_pc_hi12(%1)\n\
> + \tadd.d\t$r4,$r4,%2\n\
> + \tld.d\t$r1,$r4,%%desc_ld(%1)\n\
> + \tjir
On Thu, 2024-03-07 at 21:07 +0800, chenglulu wrote:
>
> On 2024/3/7 at 8:52 PM, Xi Ruoyao wrote:
> > It should be better to extend the expected value before the ll/sc loop
> > (like what LLVM does), instead of repeating the extending in each
> > iteration. Something like:
>
pect, operands[4],
+ operands[6]));
+}
rtx compare = operands[1];
if (operands[3] != const0_rtx)
It produces:
slli.w $r4,$r4,0
1:
ll.w $r14,$r3,0
bne $r14,$r4,2f
or $r15,$zero,$r12
sc.w $r15,$r3,0
beqz $r15,1b
b 3f
2:
dbar 0b10100
3:
for the test case and the compiled test case runs successfully. I've
not done a full bootstrap yet though.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
{ dg-options "-O2 -mcmodel=normal -mexplicit-relocs -mno-relax" } */
> > +/* { dg-final { scan-assembler-not "R_LARCH_RELAX" { target tls_native } }
> > } */
i.e. -mno-relax is used when compiling this test case, and the compiled
assembly code should not contain R_LARCH_RELAX.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Loops on named vector registers are not vectorized (see comment 11 of
PR113622), so these test cases have been failing for a while.
Rewrite them using check-function-bodies to remove hard coding register
names. A barrier is needed to always load the first operand before the
second operand.
gcc
The psABI allows using s9 as an alias of r22.
gcc/ChangeLog:
* config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add
s9 as an alias of r22.
---
v1 -> v2: Add a test case.
Ok for trunk?
gcc/config/loongarch/loongarch.h | 1 +
gcc/testsuite/gcc.target/l
Recently I've fixed two wrong FP vector negate implementations which
caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To
prevent a similar issue from happening again, add a test case.
Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS
(with MSA), LoongArch (with
int 1)))]
> ""
> - "slti\t%0,%.,%1"
> + "slt\t%0,%.,%1"
> [(set_attr "type" "slt")
> (set_attr "mode" "")])
Hmm, this define_insn seems never really used or it would generate
something like "sltu
So allowing const_imm12_operand here
brings no benefit.
> ""
> - "slti\t%0,%.,%1"
> + "slt%i1\t%0,%.,%1"
> [(set_attr "type" "slt")
> (set_attr "mode" "")])
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Thu, 2024-02-29 at 15:09 +0800, Xi Ruoyao wrote:
> Recently I've fixed two wrong FP vector negate implementations which
> caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To
> prevent a similar issue from happening again, add a test case.
>
> Tested on x8
The psABI allows using s9 as an alias of r22.
gcc/ChangeLog:
* config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add
s9 as an alias of r22.
---
Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk?
gcc/config/loongarch/loongarch.h | 1 +
1 file changed, 1
In Binutils we need to allow IE to LE relaxation only when there
is an R_LARCH_RELAX after R_LARCH_TLS_IE_PC_{HI20,LO12} so an invalid
"partial" relaxation won't happen with the extreme code model. So if we
are emitting %ie_pc_{hi20,lo12} in a non-extreme code model, emit an
R_LARCH_RELAX t
Recently I've fixed two wrong FP vector negate implementations which
caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To
prevent a similar issue from happening again, add a test case.
Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS
(with MSA), LoongArch (with
The vect_int_mod target selector is evaluated with the options in
DEFAULT_VECTCFLAGS in effect, but these options are not automatically
passed to tests outside the vect directories. So this test fails on
targets where integer vector modulo operation is supported but requires
an option to enable, f
On Thu, 2024-02-29 at 14:08 +0800, Xi Ruoyao wrote:
> > + "TARGET_TLS_DESC"
> > + "la.tls.desc\t%0,%1"
>
> With -mexplicit-relocs=always we should emit %desc_pc_lo12 etc. instead
> of la.tls.desc. As we don't want to add too much code we can just
TARGET_EXPLICIT_RELOCS_ALWAS ? "......" : "la.tls.desc\t%0,%1"; }
> + [(set_attr "got" "load")
> + (set_attr "mode" "")])
We need (set_attr "length" "16") in this list as this actually expands
into 16 bytes.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
The specification of crc/crcc instructions is clear that the output is
sign-extended to GRLEN. Add a define_insn to tell the compiler this
fact and allow it to remove the unneeded sign extension on crc/crcc
output. As crc/crcc instructions are usually used in a tight loop,
this should produce a s
Introduce an iterator for UNSPEC_CRC and UNSPEC_CRCC to make the next
change easier.
gcc/ChangeLog:
* config/loongarch/loongarch.md (CRC): New define_int_iterator.
(crc): New define_int_attr.
(loongarch_crc_w__w, loongarch_crcc_w__w): Unify
into ...
(loonga
On Thu, 2024-02-22 at 19:09 +0800, chenglulu wrote:
>
> On 2024/2/22 at 6:20 PM, Xi Ruoyao wrote:
> > To improve Binutils compatibility we've had to backport relaxation
> > support. But if a user just updates to GCC 13.3 and sticks with
> > Binutils 2.41, there is no reaso
On Fri, 2024-02-23 at 11:37 +0800, chenglulu wrote:
>
> On 2024/2/23 at 11:27 AM, Xi Ruoyao wrote:
> > On Fri, 2024-02-23 at 11:16 +0800, chenglulu wrote:
> > > On 2024/2/22 at 5:17 PM, Xi Ruoyao wrote:
> > > > The gold linker has never been ported to LoongArch (and it seems
>
On Fri, 2024-02-23 at 11:16 +0800, chenglulu wrote:
>
> On 2024/2/22 at 5:17 PM, Xi Ruoyao wrote:
> > The gold linker has never been ported to LoongArch (and it seems
> > unlikely to be ported in the future as the new architectures are
> > focusing on lld and/or mold for fast link
To improve Binutils compatibility we've had to backport relaxation
support. But if a user just updates to GCC 13.3 and sticks with
Binutils 2.41, there is no reason to use -mno-explicit-relocs as the
default because we are turning off relaxation for Binutils 2.41 (it
lacks conditional branch rel
The gold linker has never been ported to LoongArch (and it seems
unlikely to be ported in the future as the new architectures are
focusing on lld and/or mold for fast linkers).
ChangeLog:
* configure.ac (ENABLE_GOLD): Remove loongarch*-*-* from target
list.
* configure: Re
On Tue, 2024-02-20 at 19:50 +0800, chenglulu wrote:
>
> On 2024/2/20 at 7:31 PM, Xi Ruoyao wrote:
> > On Tue, 2024-02-20 at 19:25 +0800, Xi Ruoyao wrote:
> > > On Tue, 2024-02-20 at 10:07 +0800, chenglulu wrote:
> > >
> > > > So I think that without wo
On Tue, 2024-02-20 at 19:25 +0800, Xi Ruoyao wrote:
> On Tue, 2024-02-20 at 10:07 +0800, chenglulu wrote:
>
> > So I think that without worrying about performance and ensuring that
> > there is no problem
> >
> > with binutils, I think we can ma
test failures due to "excessive
errors" if running the GCC test suite with these earlier GAS versions.
Maybe we'll have to add some autoconf-based probing for the linker
anyway?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2024-02-09 at 00:02 +0800, chenglulu wrote:
>
> On 2024/2/7 at 12:23 AM, Xi Ruoyao wrote:
> > Hi Lulu,
> >
> > I'm proposing to backport r14-4674 "LoongArch: Delete macro definition
> > ASM_OUTPUT_ALIGN_WITH_NOP." to releases/gcc-12 and releases/g
On Tue, 2024-02-06 at 17:55 +0800, Xi Ruoyao wrote:
> Recently I've fixed two wrong FP vector negate implementations which
> caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To
> prevent a similar issue from happening again, add a test case.
>
> Tested on x8
4-4674 into releases/gcc-12 and releases/gcc-13
then?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Recently I've fixed two wrong FP vector negate implementations which
caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To
prevent a similar issue from happening again, add a test case.
Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS
(with MSA), LoongArch (with
On Mon, 2024-02-05 at 09:56 +0800, YunQiang Su wrote:
> Xi Ruoyao wrote on Mon, Feb 5, 2024 at 02:01:
> >
> > We expanded (neg x) to (minus const0 x) for MSA FP vectors, this is
> > wrong because -0.0 is not 0 - 0.0. This causes some Python tests to
> > fail when Pytho
We expanded (neg x) to (minus const0 x) for MSA FP vectors, this is
wrong because -0.0 is not 0 - 0.0. This causes some Python tests to
fail when Python is built with MSA enabled.
Use the bnegi.df instructions to simply reverse the sign bit instead.
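The signed-zero problem described above can be reproduced in scalar C. The sketch below is only an illustration, not code from the patch: `neg_by_sign_flip` and `neg_by_subtraction` are hypothetical helper names, and IEEE 754 binary64 `double` with the default rounding mode is assumed.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Negate by flipping the IEEE 754 sign bit, which is the effect of
   reversing the top bit of each element.  Hypothetical helper.  */
double
neg_by_sign_flip (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  bits ^= UINT64_C (1) << 63;	/* Toggle only the sign bit.  */
  memcpy (&x, &bits, sizeof x);
  return x;
}

/* Negate by computing 0 - x, i.e. the (minus const0 x) form.
   Hypothetical helper.  */
double
neg_by_subtraction (double x)
{
  volatile double zero = 0.0;	/* volatile keeps the subtraction.  */
  return zero - x;
}
```

For x = 0.0 the first helper yields -0.0 (sign bit set), while the second yields +0.0 because 0.0 - 0.0 rounds to positive zero in the default rounding mode; for nonzero finite inputs the two agree.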
gcc/ChangeLog:
* config/mips/mips-msa
On Sun, 2024-02-04 at 11:19 +0800, chenglulu wrote:
>
> On 2024/2/2 at 5:55 PM, Xi Ruoyao wrote:
> > We call loongarch_symbol_insns with mode = MAX_MACHINE_MODE sometimes.
> > But in loongarch_symbol_insns:
> >
> > if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE
On Sun, 2024-02-04 at 11:20 +0800, chenglulu wrote:
>
> On 2024/2/3 at 4:58 PM, Xi Ruoyao wrote:
> > We expanded (neg x) to (minus const0 x) for LSX FP vectors, this is
> > wrong because -0.0 is not 0 - 0.0. This causes some Python tests to
> > fail when Python is built with L
On Fri, 2024-02-02 at 10:42 +0800, chenglulu wrote:
> LGTM!
>
> Thanks!
Pushed r14-8773.
> On 2024/2/2 at 5:54 AM, Xi Ruoyao wrote:
> > When bootstrapping GCC 14 --with-build-config=bootstrap-lto, an ODR
> > violation is detected:
> >
> > ../../gcc/config/loo
We expanded (neg x) to (minus const0 x) for LSX FP vectors, this is
wrong because -0.0 is not 0 - 0.0. This causes some Python tests to
fail when Python is built with LSX enabled.
Use the vbitrevi.{d/w} instructions to simply reverse the sign bit
instead. We are already doing this for LASX and n
We call loongarch_symbol_insns with mode = MAX_MACHINE_MODE sometimes.
But in loongarch_symbol_insns:
if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P (mode))
return 0;
And LSX_SUPPORTED_MODE_P is defined as:
#define LSX_SUPPORTED_MODE_P(MODE) \
(ISA_HAS_LSX \
When bootstrapping GCC 14 --with-build-config=bootstrap-lto, an ODR
violation is detected:
../../gcc/config/loongarch/loongarch-opts.cc:57: warning:
'abi_minimal_isa' violates the C++ One Definition Rule [-Wodr]
57 | abi_minimal_isa[N_ABI_BASE_TYPES][N_ABI_EXT_TYPES];
../../gcc/con
On Thu, 2024-02-01 at 14:55 +0100, Jakub Jelinek wrote:
> On Thu, Feb 01, 2024 at 01:42:03PM +, Jonathan Yong wrote:
> > On 2/1/24 13:06, Xi Ruoyao wrote:
> > > On Thu, 2024-02-01 at 14:01 +0100, Jakub Jelinek wrote:
> > > > On Thu, Feb 01, 2024 at 12:45:3
" PRIu64 ")\n",
Should use HOST_WIDE_INT_PRINT_UNSIGNED instead of PRIu64.
>(unsigned HOST_WIDE_INT) (sizeof (IRA_INT_TYPE)
> * allocated_words_num),
>(unsigned HOST_WIDE_INT) (sizeof (IRA_INT_TYPE)
>
at.
You need to wait until the PR is accepted by the libffi maintainers.
Frankly I don't know what the libffi maintainers are busy with and I'm
frustrated as well (having a MIPS patch unreviewed there for a month)
but this is the procedure :(.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Sat, 2024-01-27 at 18:02 +0800, Xi Ruoyao wrote:
> On Sat, 2024-01-27 at 11:15 +0800, chenglulu wrote:
> >
> > On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote:
> > > On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
> > > > On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
> > >
On Sat, 2024-01-27 at 11:15 +0800, chenglulu wrote:
>
> On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
> > > On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
> > > > On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote:
> > >
On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
>
> On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote:
> > > v3 -> v4:
> > > 1. Add macro support for TLS symbols
> > > 2. Added support for loading __get_t
turn "la.tls.gd\t%0,%2,%1";
> + case SYMBOL_TLSLDM:
> + return "la.tls.ld\t%0,%2,%1";
> +
> + default:
> + gcc_unreachable ();
> + }
> +}
> + "&& REG_P (operands[1]) && find_reg_note (insn, REG_UNUSED, operands[2]) !=
> 0"
> + [(set (match_dup 0) (match_dup 1))]
> + ""
> + [(set_attr "mode" "DI")
> + (set_attr "length" "5")])
Should be 20, in bytes.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
n "la.tls.le\t%0,%1";
> + case SYMBOL_TLS_IE:
> + return "la.tls.ie\t%0,%1";
> + case SYMBOL_TLSLDM:
> + return "la.tls.ld\t%0,%1";
> + case SYMBOL_TLSGD:
> + return "la.tls.gd\t%0,%1";
/* snip */
> + default:
eme TLS GD/LD with -mexplicit-relocs=auto.
I've rebased and attached the patch to fix the bad split in -mexplicit-
relocs={always,auto} -mcmodel=extreme on top of this series. I've not
tested it seriously though (only tested the added and modified test
cases).
--
Xi Ruoyao
School of Aero
On Thu, 2024-01-25 at 08:48 +0800, chenglulu wrote:
>
> On 2024/1/24 at 3:36 AM, Xi Ruoyao wrote:
> > On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
> > > > > The failure of this test case was because the compiler believes that
> > > > > two
> > >
On Wed, 2024-01-24 at 19:08 +0800, chenxiaolong wrote:
> At 19:00 +0800 on Wednesday, 2024-01-24, Xi Ruoyao wrote:
> > On Wed, 2024-01-24 at 18:32 +0800, chenxiaolong wrote:
> > > On 20:09 +0800 on Tuesday, 2024-01-23, Xi Ruoyao wrote:
> > > > The vect_int_mod target
On Wed, 2024-01-24 at 18:32 +0800, chenxiaolong wrote:
> On 20:09 +0800 on Tuesday, 2024-01-23, Xi Ruoyao wrote:
> > The vect_int_mod target selector is evaluated with the options in
> > DEFAULT_VECTCFLAGS in effect, but these options are not automatically
> > passed to
n __inline float
> __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> __frecipe_s (float _1)
> {
> - __builtin_loongarch_frecipe_s ((float) _1);
> + return (float) __builtin_loongarch_frecipe_s ((float) _1);
I don't think the (float) conversion is needed.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
k only
papers over the same issue that caused the spec2006 failure. I tried a bootstrap
with BOOT_CFLAGS=-O2 -g -mcmodel=extreme and TARGET_DELEGITIMIZE_ADDRESS
commented out, and there is no more spurious "note: non-delegitimized
UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" things.
I feel that this hook is still written in a buggy way, so maybe removing
it will solve the spec2017 issue.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
The vect_int_mod target selector is evaluated with the options in
DEFAULT_VECTCFLAGS in effect, but these options are not automatically
passed to tests out of the vect directories. So this test fails on
targets where integer vector modulo operation is supported but requiring
an option to enable, f
When building GCC with --enable-default-ssp, the stack protector is
enabled for got-load.C, causing additional GOT loads for
__stack_chk_guard. So mem/u will be matched more than 2 times and the
test will fail.
Disable stack protector to fix this issue.
gcc/testsuite:
* g++.target/loong
On Tue, 2024-01-23 at 10:37 +0800, chenglulu wrote:
> LGTM!
>
> Thanks!
Pushed v2 as attached. The only change is in the comment: Qinggang told
me TLS LE relaxation actually *requires* explicit relocs.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian Univer
Binutils 2.42 supports TLS LD/GD relaxation which requires the assembler
macro.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_explicit_relocs_p):
If la_opt_explicit_relocs is EXPLICIT_RELOCS_AUTO, return false
for SYMBOL_TLS_LDM and SYMBOL_TLS_GD.
(loon
gister_operand" "=r")
(unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
With this the buggy REG_UNUSED notes were gone. But it then prevented
the CSE when loading the address of __tls_get_addr (i.e. if we address
10 TLS LD symbols in a function it would emit 10 instance
't understand the purpose of adding
> '-fno-tree-vectorize' here.
I don't think -fno-tree-vectorize will make a difference here. This
test case uses __attribute__((vector_size(...))) explicitly so the
vector operation will be used even with -fno-tree-vectorize.
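As a rough illustration of the point about explicit vector types, here is a minimal sketch using the GCC vector extension. The type name `v4si` and the helpers `add_v4si` and `sum_lanes` are made up for this example and are not taken from the test case under discussion; which instructions are actually emitted still depends on the target and the options in effect.

```c
/* A 16-byte vector of four ints, declared with the GCC vector
   extension.  The operation below is a vector operation in the source
   itself, so it does not rely on the auto-vectorizer.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Element-wise addition written directly on the vector type.  */
v4si
add_v4si (v4si a, v4si b)
{
  return a + b;
}

/* Sum the four lanes using vector subscripting.  */
int
sum_lanes (v4si v)
{
  return v[0] + v[1] + v[2] + v[3];
}
```

Because the addition is expressed on `v4si` directly, -fno-tree-vectorize has nothing to turn off here: that option only controls whether scalar loops get converted into vector code.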
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote:
>
> On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
> > On Saturday, 2024-01-13 at 15:01 +0800, chenglulu wrote:
> > > On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > > > On Friday, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> > > >
> > >
ite/lib/dg-options.exp
> +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> @@ -337,6 +337,7 @@ proc add_options_for_libatomic { flags } {
> || ([istarget powerpc*-*-*] && [check_effective_target_ilp32])
> || [istarget riscv*-*-*]
> || ([istarget sparc*-*-linux-gnu] &
On Tue, 2024-01-16 at 12:58 +0800, Xi Ruoyao wrote:
> On Tue, 2024-01-16 at 10:57 +0800, chenxiaolong wrote:
> > On Monday, 2024-01-15 at 15:50 +0800, Xi Ruoyao wrote:
> > > On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote:
> > > > At 14:42 +0800 on Monday, 2024-01-15,
On Tue, 2024-01-16 at 14:16 +0800, chenglulu wrote:
>
>
> On 2024/1/16 at 1:34 PM, Xi Ruoyao wrote:
> > Ping.
> >
> > On Fri, 2023-12-15 at 20:56 +0800, Xi Ruoyao wrote:
> > > We don't allow SImode in FCC, so constraint z is never really us
Ping.
On Fri, 2023-12-15 at 20:56 +0800, Xi Ruoyao wrote:
> We don't allow SImode in FCC, so constraint z is never really used
> here.
>
> gcc/ChangeLog:
>
> * config/loongarch/loongarch.md (movsi_internal): Remove
> constraint z.
> ---
>
> Bootst
On Tue, 2024-01-16 at 10:57 +0800, chenxiaolong wrote:
> On Monday, 2024-01-15 at 15:50 +0800, Xi Ruoyao wrote:
> > On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote:
> > > At 14:42 +0800 on Monday, 2024-01-15, Xi Ruoyao wrote:
> > > > On Mon, 2024-01-15 at
On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote:
> At 14:42 +0800 on Monday, 2024-01-15, Xi Ruoyao wrote:
> > On Mon, 2024-01-15 at 14:32 +0800, YunQiang Su wrote:
> > > Xi Ruoyao wrote at 12:11pm on Monday, January
> > > 15, 2024:
> > >
On Mon, 2024-01-15 at 14:32 +0800, YunQiang Su wrote:
> Xi Ruoyao wrote on Mon, Jan 15, 2024 at 12:11:
> >
> > On Mon, 2024-01-15 at 09:29 +0800, chenxiaolong wrote:
> > > At 21:13 +0800 on Saturday, 2024-01-13, Xi Ruoyao wrote:
> > > > At 15:28 +0800 on Saturday 2024-01-1
On Mon, 2024-01-15 at 09:29 +0800, chenxiaolong wrote:
> At 21:13 +0800 on Saturday, 2024-01-13, Xi Ruoyao wrote:
> > At 15:28 +0800 on Saturday 2024-01-13, chenxiaolong wrote:
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.dg/pr104992.c: Added addition
821731 100644
> --- a/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
> +++ b/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
> @@ -2,6 +2,7 @@
> ! { dg-require-effective-target vect_double }
> ! { dg-options "-O3 --param vect-max-peeling-for-alignment=0
> -fpredi
On Saturday, 2024-01-13 at 15:01 +0800, chenglulu wrote:
>
> On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > On Friday, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> >
> > > > I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
> > > > we n
-shared --enable-bootstrap
> --enable-checking=release
> $ make BOOT_FLAGS="-mcmodel=extreme"
>
> What did I do wrong?:-(
BOOT_CFLAGS, not BOOT_FLAGS :).
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
can-assembler-times "slli.w\t\\\$r\[0-9\]+,\\\$r\[0-9\]+,0"
> 0 } } */
Use scan-assembler-not instead of scan-assembler-times ... 0.
Otherwise LGTM.
> #include
> #define my_min(x, y) ((x) < (y) ? (x) : (y))
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
"")])
>
> +(define_insn "*nsi_internal"
> + [(set (match_operand:SI 0 "register_operand" "=r")
> + (neg_bitwise:SI
> + (not:SI (match_operand:SI 1 "register_operand" "r"))
> + (match_operand:SI 2 "register_operand" "r")))]
> + "TARGET_64BIT"
> + "n\t%0,%2,%1"
> + [(set_attr "type" "logical")
> + (set_attr "mode" "SI")])
>
> ;;
> ;;
> @@ -3167,7 +3210,6 @@ (define_expand "condjump"
> (label_ref (match_operand 1))
> (pc)))])
>
> -
>
> ;;
> ;;
> @@ -3967,10 +4009,13 @@ (define_insn "bytepick_w_"
> (define_insn "bytepick_w__extend"
> [(set (match_operand:DI 0 "register_operand" "=r")
> (sign_extend:DI
> - (ior:SI (lshiftrt (match_operand:SI 1 "register_operand" "r")
> - (const_int ))
> - (ashift (match_operand:SI 2 "register_operand" "r")
> - (const_int bytepick_w_ashift_amount)]
> + (subreg:SI
> + (ior:DI (subreg:DI (lshiftrt
> + (match_operand:SI 1 "register_operand" "r")
> + (const_int )) 0)
> + (subreg:DI (ashift
> + (match_operand:SI 2 "register_operand" "r")
> + (const_int bytepick_w_ashift_amount)) 0)) 0)))]
> "TARGET_64BIT"
> "bytepick.w\t%0,%1,%2,"
> [(set_attr "mode" "SI")])
> diff --git a/gcc/testsuite/gcc.target/loongarch/sign-extend-bitwise.c
> b/gcc/testsuite/gcc.target/loongarch/sign-extend-bitwise.c
> new file mode 100644
> index 000..5753ef69db2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/sign-extend-bitwise.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mabi=lp64d -O2" } */
> +/* { dg-final { scan-assembler-not "slli.w\t\\\$r\[0-9\]+,\\\$r\[0-9\]+,0" }
> } */
> +
> +struct pmop
> +{
> + unsigned int op_pmflags;
> + unsigned int op_pmpermflags;
> +};
> +unsigned int PL_hints;
> +
> +struct pmop *pmop;
> +void
> +Perl_newPMOP (int type, int flags)
> +{
> + if (PL_hints & 0x0010)
> + pmop->op_pmpermflags |= 0x0001;
> + if (PL_hints & 0x0004)
> + pmop->op_pmpermflags |= 0x0800;
> + pmop->op_pmflags = pmop->op_pmpermflags;
> +}
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
_rtx (DImode);
> + emit_insn (gen_addsi3_extended (t, operands[1], operands[2]));
AFAIK if !TARGET_64BIT a DImode should be actually a pair of hardware
registers, but addsi3_extended doesn't output such a pair so this seems
invalid...
> + t = gen_lowpart (SImode, t);
> +
On Fri, 2024-01-05 at 20:45 +0800, chenglulu wrote:
>
> On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> > > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > &
HAS_DIV32 etc. in the code base? It seems some of them are not
replaced.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> >
> > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > > bool
> > > > loongarch_ex
fective_target_loongarch_sx] ||" because SIMD
requires hard float.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
>
> On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > bool
> > > loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > {
> > > +
e several hours trying to implement this...
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
_effective_target_s390_vx])
> > +|| ([istarget riscv*-*-*]
> > + && [check_effective_target_riscv_v])
>
> Unless I'm missing something, we have copysign in the scalar
> floating-point ISAs as well. So I think this should be
>
> || ([istarget riscv*-*-*]
> && [check_effective_target_hard_float])
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
's to get as much testing
> as possible. Assuming the rest is ACK'd for the trunk we'll put it into
> the list of optimizations enabled by -O2.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Thu, 2024-01-04 at 11:58 +0800, chenglulu wrote:
>
> On 2024/1/4 at 11:51 AM, Xi Ruoyao wrote:
> > On Wed, 2023-12-27 at 16:46 +0800, Lulu Cheng wrote:
> > > +(define_insn "movdi_pcrel64"
> > > + [(set (match_operand:DI 0 "register_oper
match_operand:DI 2 "register_operand" "=&r"))]
And use
gen_movdi_pcrel64 (operands[0], operands[1], gen_reg_rtx(DImode))
in expand.
> + "TARGET_64BIT"
> + "la.local %0,$r15,%1"
> + [(set_attr "mode" "DI")
> + (set_attr "length" "5")])
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Wed, 2024-01-03 at 16:24 +0800, chenglulu wrote:
> LGTM!
>
> Thanks!
Pushed r14-6890.
FWIW the tree optimizer sometimes still fails to emit .reduc_f{max,min} or
it emits them sub-optimally. I've commented in PR112457 but maybe I
should've created a new ticket...
> On 2024
We already had smin/smax RTL pattern using vfmin/vfmax instructions.
But for smin/smax, it's unspecified what will happen if either operand
contains a NaN. So we would not vectorize the loop with
-fno-finite-math-only (the default for all optimization levels except
-Ofast).
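To make the NaN concern concrete, here is a small scalar sketch. It is an illustration only: `min_ternary` and `min_ieee` are hypothetical names, and `min_ieee` is a hand-written stand-in that mirrors how C's fmin treats a single NaN operand (it returns the other operand); it does not model signed-zero ordering.

```c
#include <math.h>

/* A plain comparison-based minimum.  Its result when one operand is a
   NaN depends on which operand it is, because every ordered comparison
   with a NaN is false.  Hypothetical helper.  */
double
min_ternary (double a, double b)
{
  return a < b ? a : b;
}

/* A minimum that returns the non-NaN operand when exactly one operand
   is a NaN, as fmin does.  Hypothetical helper.  */
double
min_ieee (double a, double b)
{
  if (isnan (a))
    return b;
  if (isnan (b))
    return a;
  return a < b ? a : b;
}
```

With a NaN as the second operand, `min_ternary` returns the NaN while `min_ieee` returns the other value, so the two forms are interchangeable only when NaNs can be assumed absent, which is what -ffinite-math-only asserts.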
But, LoongA
On Sat, 2023-12-30 at 20:25 +0800, Xi Ruoyao wrote:
> On Sat, 2023-12-30 at 12:15 +, Richard Sandiford wrote:
> > This shouldn't be necessary. The test does:
> >
> > for (int i = 0; i < n; i += 2)
> > {
> > x0 = __builtin_fmin (x0, ptr[i
but not reduc_fmin_scal_*?
> If so, we probably need a new target selector for fmin/fmax reduction.
Let me check whether the [x]vf{min,max} instructions are IEEE-conformant. They've
still not released volume 2 of the instruction manual so I can only
try...
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
gcc/ChangeLog:
* config/loongarch/loongarch.md (bstrins__for_ior_mask):
For the condition, remove unneeded trailing "\" and move "&&" to
follow GNU coding style. NFC.
---
Pushed as obvious.
gcc/config/loongarch/loongarch.md | 4 ++--
1 file changed, 2 insertions(+), 2 d
Pushed v4 as attached, with the format issues fixed and a minor
adjustment in the commit message ("define_insn_and_split" is changed to
"define_insn_and_rewrite" to match the actual change).
On Fri, 2023-12-29 at 19:55 +0800, Xi Ruoyao wrote:
> On Fri, 2023-12-29 at 15:57
+ op = XEXP (op, 0);
> > + return symbolic_pcrel_operand (op, Pmode) ||
> > +symbolic_pcrel_offset_operand (op, Pmode);
> > +})
> > +
> >
> The '||' operator shouldn't be at the end of the line.
Indeed.
>
> + return symbolic_pcrel_operand (op, Pmode)
> + || symbolic_pcrel_offset_operand (op, Pmode);
>
> Others LGTM.
> Thanks!
>
> /* snip */
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
The problem with peephole2 is it uses a naive sliding-window algorithm
and misses many cases. For example:
float a[1];
float t() { return a[0] + a[8000]; }
is compiled to:
la.local $r13,a
la.local $r12,a+32768
fld.s $f1,$r13,0
fld.s $f0,$r12,-768
; >nelt,
> +
> rperm));
> + tmp = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
Likewise.
> + emit_move_insn (tmp, sel);
> + break;
> + case E_V8SFmode:
> + sel = ge
ymbol_ref:DI ("*.LANCHOR0") [flags 0x182])) [0 S1
> A8]))) "volatile.c":5:11 -1
> (nil))
>
> The volatile property of the mem here is gone, so the test fails.
Phew. I guess I couldn't reproduce it because I have Jeff's ext-dce
patch in my local repo, which removed the zero_extend...
I'll rework this patch.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
The GCC internal doc says:
X might be a pseudo-register or a 'subreg' of a pseudo-register,
which could either be in a hard register or in memory. Use
'true_regnum' to find out; it will return -1 if the pseudo is in
memory and the hard register number if it is in a register.