On 2024/1/4 at 5:05 PM, chenglulu wrote:
On 2024/1/4 at 12:05 PM, Xi Ruoyao wrote:
On Thu, 2024-01-04 at 11:58 +0800, chenglulu wrote:
On 2024/1/4 at 11:51 AM, Xi Ruoyao wrote:
On Wed, 2023-12-27 at 16:46 +0800, Lulu Cheng wrote:
+(define_insn "movdi_pcrel64"
+ [(set (match_operand:DI 0 "r
Pushed to r14-7085...r14-7088
On 2024/1/8 at 9:14 AM, Yang Yujie wrote:
This patchset performs some code cleanup, and is bootstrapped and regtested
on loongarch64-linux-gnu.
Changes from v1 -> v2:
* Replaced all TARGET_ macros from .opt.
* Fixed definition of ISA_HAS_LAMCAS.
Yang Yujie (4):
On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
we need a target hook to tell the generic code
UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
see millions of lines
On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
we need a target hook to tell the generic code
Sorry, I've been busy with something else these two days. I don't think
there's anything wrong with the code,
but I need to test the spec.:-)
On 2023/12/21 at 7:56 PM, Xi Ruoyao wrote:
Ping :).
On Tue, 2023-12-12 at 14:47 +0800, Xi Ruoyao wrote:
The problem with peephole2 is it uses a naive
On 2023/12/19 at 8:37 PM, Xi Ruoyao wrote:
On Tue, 2023-12-19 at 19:04 +0800, Lulu Cheng wrote:
+(define_insn "@add_tls_le_relax"
+ [(set (match_operand:P 0 "register_operand" "=r")
+ (unspec:P [(match_operand:P 1 "register_operand" "r")
+ (match_operand:P 2 "register_operand"
On 2023/11/23 at 3:11 PM, Xi Ruoyao wrote:
On Thu, 2023-11-23 at 14:35 +0800, chenglulu wrote:
Hi,
I don’t quite understand this part. Is it because define_insn would be
duplicated with the above implementation,
so define_insn_and_split is used?
Yes, but if you think duplicating the above
On 2023/11/23 at 3:31 PM, chenglulu wrote:
On 2023/11/23 at 3:11 PM, Xi Ruoyao wrote:
On Thu, 2023-11-23 at 14:35 +0800, chenglulu wrote:
Hi,
I don’t quite understand this part. Is it because define_insn
would be
duplicated with the above implementation,
so define_insn_and_split is used?
Yes
On 2023/11/23 at 5:02 PM, Xi Ruoyao wrote:
On Thu, 2023-11-23 at 16:13 +0800, chenglulu wrote:
The fix_truncv4sfv4si2 template is indeed called when debugging with
gdb.
So I think we can use define_expand here.
The problem is cases where we want to combine an rint call with float-
to-int conversion
LGTM.
Thanks.
On 2023/11/20 at 8:47 AM, Xi Ruoyao wrote:
Remove unnecessary UNSPECs and make the [x]vrotr[i] instructions useful
with GNU vectors and auto vectorization.
gcc/ChangeLog:
* config/loongarch/lsx.md (bitimm): Move to ...
(UNSPEC_LSX_VROTR): Remove.
(lsx_vrotr_):
I tested it and it was fine. I never knew this could be used like this.
Thank you!
On 2023/11/20 at 8:47 AM, Xi Ruoyao wrote:
No functional change, just a cleanup.
gcc/ChangeLog:
* config/loongarch/loongarch.md (lrint_allow_inexact): Remove.
(2): Check if
== UNSPEC_FTINT
On 2023/11/23 at 4:58 PM, Xi Ruoyao wrote:
On Thu, 2023-11-23 at 16:23 +0800, chenglulu wrote:
I tested it and it was fine. I never knew this could be used like
this.
I remember when I wrote r13-3920 I tried this but failed. Maybe
something has been improved in machine description parser
LGTM.
Thanks!
On 2023/11/20 at 8:47 AM, Xi Ruoyao wrote:
Remove unnecessary UNSPECs and make the muh instructions useful with
GNU vectors or auto vectorization.
gcc/ChangeLog:
* config/loongarch/simd.md (muh): New code attribute mapping
any_extend to smul_highpart or umul_highpart.
On 2023/11/29 at 3:12 PM, Xi Ruoyao wrote:
On Mon, 2023-11-20 at 08:47 +0800, Xi Ruoyao wrote:
The [1/5] patch is the PR112578 fix at
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637097.html.
It has been changed to remove the nearbyint pattern (because nearbyint
should not raise FE_INEXACT
On 2023/11/20 at 8:47 AM, Xi Ruoyao wrote:
The usage of LSX and LASX frint/ftint instructions had some problems:
1. These instructions raise FE_INEXACT, which is not allowed with
-fno-fp-int-builtin-inexact for most C2x section F.10.6 functions
(the only exceptions are rint, lrint, and llrint).
Pushed to r14-6070.
On 2023/11/29 at 9:53 AM, Xi Ruoyao wrote:
On Tue, 2023-11-28 at 15:56 +0800, Li Wei wrote:
In the r14-5547 commit, C[LT]Z_DEFINED_VALUE_AT_ZERO were defined at
the same time, but in fact, CLZ_DEFINED_VALUE_AT_ZERO has already been
defined, so remove the duplicate definition.
Pushed to r14-6072.
On 2023/11/28 at 3:38 PM, Li Wei wrote:
In LoongArch, the vector popcount has corresponding instructions, while
the scalar does not. Currently, the scalar popcount is calculated
through a loop, and the value of a non-power of two needs to be iterated
several times, so the vector
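The loop-based scalar popcount mentioned in this message can be sketched as follows. This is only an illustration, assuming a "clear the lowest set bit" loop; the name `popcount_loop` is hypothetical and this is not necessarily the routine GCC emits or calls on LoongArch. It shows why a power of two finishes after one iteration while other values iterate once per set bit.

```c
#include <stdint.h>

/* Hypothetical helper: count set bits by clearing the lowest set bit
   on each iteration, so the loop body runs once per set bit.  */
static int
popcount_loop (uint64_t x)
{
  int n = 0;
  while (x != 0)
    {
      x &= x - 1;  /* Clear the lowest set bit.  */
      n++;
    }
  return n;
}
```

A value such as 8 needs a single iteration, while 0xFF needs eight, which is the cost a single vector popcount instruction would avoid.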
On 2023/12/2 at 6:15 PM, Xi Ruoyao wrote:
On Sat, 2023-12-02 at 16:14 +0800, Lulu Cheng wrote:
/* snip */
diff --git a/gcc/config/loongarch/loongarch-opts.cc
b/gcc/config/loongarch/loongarch-opts.cc
index b5836f198c0..6861642a98d 100644
--- a/gcc/config/loongarch/loongarch-opts.cc
+++
On 2023/11/29 at 5:44 PM, Xi Ruoyao wrote:
On Tue, 2023-11-28 at 15:39 +0800, Li Wei wrote:
For vector constant extract-{even/odd} permutation replace the default
[x]vshuf instruction combination with [x]vilv{l/h} instruction, which
can reduce instructions and improve performance.
gcc/ChangeLog:
On 2023/11/24 at 10:39 AM, Xi Ruoyao wrote:
On Thu, 2023-11-23 at 18:03 +, Joseph Myers wrote:
The rint functions indeed don't set errno (don't have domain or range
errors, at least if you ignore the option for signaling NaNs arguments to
be domain errors - which is in TS 18661-1, but not what
On 2023/11/24 at 4:42 PM, Xi Ruoyao wrote:
On Fri, 2023-11-24 at 16:36 +0800, chenglulu wrote:
On 2023/11/24 at 4:26 PM, Xi Ruoyao wrote:
On Fri, 2023-11-24 at 16:01 +0800, chenglulu wrote:
I only saw lrint llrint in n2310 with this description:
F7.12.9.5
"The lrint and llrint functions
On 2023/11/24 at 4:26 PM, Xi Ruoyao wrote:
On Fri, 2023-11-24 at 16:01 +0800, chenglulu wrote:
I only saw lrint llrint in n2310 with this description:
F7.12.9.5
"The lrint and llrint functions round their argument to the nearest
integer value, rounding
according to the current rounding dire
On 2023/11/23 at 8:24 PM, Xi Ruoyao wrote:
On Thu, 2023-11-23 at 17:14 +0800, chenglulu wrote:
When I look at this code and compare it to our scalar implementation, it
seems
that our scalar implementation still lacks an "lround".
Should be "lroundeven". We don't have an in
On 2023/11/24 at 6:30 PM, Xi Ruoyao wrote:
On Fri, 2023-11-24 at 17:46 +0800, chenglulu wrote:
It's just that I'm confused that the description of rint in n2310,
including Joseph's email,
all say that rint will not set errno, but linux-man says "which might
set errno to ERANGE" .
Annex F h
On 2023/12/2 at 9:41 PM, Xi Ruoyao wrote:
On Sat, 2023-12-02 at 20:44 +0800, chenglulu wrote:
@@ -657,12 +658,18 @@ abi_str (struct loongarch_abi abi)
strlen (loongarch_abi_base_strings[abi.base]));
else
{
+ /* This situation has not yet occurred, so in order
Pushed to r14-6303 and r14-6304.
On 2023/12/5 at 10:30 AM, Lulu Cheng wrote:
1. Rebase Xi Ruoyao's patch to the latest commit.
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636798.html
2. remove the #if
!defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS) && !defined(IN_RTS)
guards in
Pushed to r14-6308.
On 2023/11/17 at 5:00 PM, Jiahao Xu wrote:
This patch adds support for xorsign pattern to scalar fp and vector. With the
new expands, uniformly using vector bitwise logical operations to handle
xorsign.
On LoongArch64, floating-point registers and vector registers share the same
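For readers unfamiliar with the xorsign pattern: GCC documents `xorsign (x, y)` as equivalent to `x * copysign (1.0, y)`, which can be computed with bitwise logic alone. The sketch below is a scalar C illustration of that idea, assuming IEEE 754 binary32 `float`; the name `xorsign_bits` is hypothetical and this is not the code from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: x * copysign (1.0f, y) using only bitwise
   operations, i.e. XOR the sign bit of y into x.  */
static float
xorsign_bits (float x, float y)
{
  uint32_t xb, yb;
  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);
  xb ^= yb & UINT32_C (0x80000000);  /* Keep only y's sign bit.  */
  memcpy (&x, &xb, sizeof x);
  return x;
}
```

Because only an AND and an XOR are involved, the same operation maps naturally onto vector bitwise instructions.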
Pushed to r14-6316.
On 2023/11/29 at 11:16 AM, Jiahao Xu wrote:
For [x]vshuf instructions, if the index value in the selector exceeds 63, it
triggers
undefined behavior on LA464, but not on LA664. To ensure compatibility of these
two
tests on both LA464 and LA664, we have modified both tests to
Pushed to r14-6311...r14-6315.
On 2023/12/6 at 3:04 PM, Jiahao Xu wrote:
LoongArch V1.1 adds support for approximate instructions, which are utilized
along with additional
Newton-Raphson steps implement single precision floating-point division, square
root and reciprocal
square root operations for
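The Newton-Raphson refinement mentioned here can be sketched in scalar C. This is a minimal illustration under stated assumptions: the initial estimates are passed in by the caller (on hardware they would come from the approximate instructions), the names `refine_recip` and `refine_rsqrt` are hypothetical, and the number of steps GCC actually emits is not stated in this snippet.

```c
/* One Newton-Raphson step for an estimate x0 of 1/a:
   x1 = x0 * (2 - a * x0).  */
static float
refine_recip (float a, float x0)
{
  return x0 * (2.0f - a * x0);
}

/* One Newton-Raphson step for an estimate y0 of 1/sqrt(a):
   y1 = y0 * (1.5 - 0.5 * a * y0 * y0).  */
static float
refine_rsqrt (float a, float y0)
{
  return y0 * (1.5f - 0.5f * a * y0 * y0);
}
```

Each step roughly doubles the number of correct bits, so a coarse hardware estimate plus a small number of steps can approach single-precision accuracy.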
Pushed to r14-6317.
On 2023/11/29 at 11:18 AM, Jiahao Xu wrote:
loongarch_expand_vec_cond_mask_expr generates 'subreg's of 'subreg's, which are
not supported in GCC, and this causes an ICE:
ice.c:55:1: error: unrecognizable insn:
55 | }
| ^
(insn 63 62 64 8 (set (reg:V4DI 278)
On 2023/12/7 at 8:20 PM, Xi Ruoyao wrote:
There seems to be no real reason to require -mexplicit-relocs=always for
-mcmodel=extreme or model attribute. As the linker does not know how to
relax a 3-operand la.local or la.global pseudo instruction, just emit
explicit relocs for SYMBOL_PCREL64, and under
Pushed to r14-5864.
On 2023/11/23 at 11:05 AM, Guo Jie wrote:
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_split_plus_constant):
avoid left shift of negative value -0x8000.
---
gcc/config/loongarch/loongarch.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
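As background for this ChangeLog entry: in C, and in C++ before C++20, left-shifting a negative signed value is undefined behavior. The sketch below shows one common way to avoid it by shifting in unsigned arithmetic. It is an assumption-laden illustration, not the actual change in loongarch.cc: the name `shl_no_ub` and the shift count used in the usage note are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: shift a possibly negative value left without
   invoking undefined behavior by doing the shift on the unsigned
   representation.  The conversion back to int64_t wraps on GCC.  */
static int64_t
shl_no_ub (int64_t value, unsigned int count)
{
  return (int64_t) ((uint64_t) value << count);
}
```

For example, `shl_no_ub (-0x8000, 16)` yields -2147483648 without ever shifting a negative signed operand.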
Pushed to r14-5863.
On 2023/11/18 at 2:59 PM, Guo Jie wrote:
For the following immediate load operation in
gcc/testsuite/gcc.target/loongarch/imm-load1.c:
long long r = 0x0101010101010101;
Before this patch:
lu12i.w $r15,16842752>>12
ori $r15,$r15,257
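The arithmetic behind the two instructions quoted so far can be checked with a small C model. This is a sketch only: `lu12i_w` and `ori` below are hypothetical helper functions that model the documented semantics of the LoongArch instructions of the same name, and the rest of the "before" sequence is truncated in this snippet, so it is not modeled.

```c
#include <stdint.h>

/* Model of lu12i.w: place the 20-bit immediate in bits 31..12, clear
   bits 11..0, then sign-extend the 32-bit result to 64 bits.  */
static int64_t
lu12i_w (int32_t si20)
{
  uint32_t low32 = ((uint32_t) si20 & 0xfffffu) << 12;
  return (int64_t) (int32_t) low32;
}

/* Model of ori: bitwise OR with a zero-extended 12-bit immediate.  */
static int64_t
ori (int64_t rj, uint32_t ui12)
{
  return rj | (int64_t) (ui12 & 0xfffu);
}
```

Here 16842752 is 0x01010000 and 257 is 0x101, so `ori (lu12i_w (16842752 >> 12), 257)` produces 0x01010101, the low 32 bits of 0x0101010101010101.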
On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote:
On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote:
v3 -> v4:
1. Add macro support for TLS symbols
2. Added support for loading __get_tls_addr symbol addr
On 2024/1/27 at 7:11 PM, Xi Ruoyao wrote:
On Sat, 2024-01-27 at 18:02 +0800, Xi Ruoyao wrote:
On Sat, 2024-01-27 at 11:15 +0800, chenglulu wrote:
On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote:
On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
On Fri, 2024-01-26 at 15:37
On 2024/1/27 at 10:03 PM, chenglulu wrote:
On 2024/1/27 at 7:11 PM, Xi Ruoyao wrote:
On Sat, 2024-01-27 at 18:02 +0800, Xi Ruoyao wrote:
On Sat, 2024-01-27 at 11:15 +0800, chenglulu wrote:
On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote:
On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
On 2024/1/26 at 4:49 PM, Xi Ruoyao
Pushed to r14-8722.
On 2024/1/26 at 4:41 PM, Li Wei wrote:
We found that when only 128-bit vectorization was enabled, 549.fotonik3d_r
failed to vectorize effectively. For this reason, we adjust the cost of
128-bit vector_stmt that match the multiply-add pattern to facilitate 128-bit
vectorization.
The
Pushed to r14-8723.
On 2024/1/24 at 5:19 PM, Jiahao Xu wrote:
gcc/ChangeLog:
* config/loongarch/larchintrin.h
(__frecipe_s): Update function return type.
(__frecipe_d): Ditto.
(__frsqrte_s): Ditto.
(__frsqrte_d): Ditto.
gcc/testsuite/ChangeLog:
*
Pushed to r14-8717...r14-8721.
On 2024/1/29 at 4:21 PM, Lulu Cheng wrote:
When cmodel=extreme, since the symbol address is obtained through four
instructions,
errors may occur in some cases during linking. Xi Ruoyao fixes this problem.
Pushed to r14-8716.
On 2024/1/30 at 3:55 PM, Lulu Cheng wrote:
Modify address calculation logic from (((a x C) + fp) + offset) to ((fp +
offset) + a x C).
Thereby modifying the register dependencies and optimizing the code.
The value of C is 2, 4 or 8.
The following is the assembly code before and
LGTM!
Thanks!
On 2024/2/2 at 5:54 AM, Xi Ruoyao wrote:
When bootstrapping GCC 14 --with-build-config=bootstrap-lto, an ODR
violation is detected:
../../gcc/config/loongarch/loongarch-opts.cc:57: warning:
'abi_minimal_isa' violates the C++ One Definition Rule [-Wodr]
57 |
Ping?
On 2024/1/30 at 10:09 AM, Lulu Cheng wrote:
From: chenguoqi
libsanitizer/ChangeLog:
* configure.tgt: Enable tsan and lsan for loongarch64.
* tsan/Makefile.am: Add tsan_rtl_loongarch64.S to
EXTRA_libtsan_la_SOURCES.
* tsan/Makefile.in: Regenerate.
---
On 2024/2/3 at 4:58 PM, Xi Ruoyao wrote:
We expanded (neg x) to (minus const0 x) for LSX FP vectors, this is
wrong because -0.0 is not 0 - 0.0. This causes some Python tests to
fail when Python is built with LSX enabled.
Use the vbitrevi.{d/w} instructions to simply reverse the sign bit
instead. We
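The difference between the two ways of negating can be demonstrated in scalar C. This is a minimal sketch, assuming IEEE 754 binary64 `double` and the default rounding mode; the helper names are hypothetical and the code only mirrors the idea of flipping the sign bit, not the vbitrevi-based expansion itself.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a double.  */
static uint64_t
bits_of (double x)
{
  uint64_t b;
  memcpy (&b, &x, sizeof b);
  return b;
}

/* The incorrect expansion: 0.0 - 0.0 is +0.0, not -0.0.  */
static double
neg_by_subtraction (double x)
{
  return 0.0 - x;
}

/* The correct approach: flip only the sign bit.  */
static double
neg_by_sign_flip (double x)
{
  uint64_t b = bits_of (x) ^ (UINT64_C (1) << 63);
  memcpy (&x, &b, sizeof x);
  return x;
}
```

For an input of +0.0, the subtraction form returns a value with the sign bit clear, while the sign-flip form returns -0.0 as negation requires.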
On 2024/2/2 at 5:55 PM, Xi Ruoyao wrote:
We call loongarch_symbol_insns with mode = MAX_MACHINE_MODE sometimes.
But in loongarch_symbol_insns:
if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P (mode))
return 0;
And LSX_SUPPORTED_MODE_P is defined as:
#define
On 2024/2/2 at 6:01 PM, Jakub Jelinek wrote:
On Tue, Jan 30, 2024 at 10:09:51AM +0800, Lulu Cheng wrote:
From: chenguoqi
libsanitizer/ChangeLog:
* configure.tgt: Enable tsan and lsan for loongarch64.
* tsan/Makefile.am: Add tsan_rtl_loongarch64.S to
EXTRA_libtsan_la_SOURCES.
This
Pushed to r14-8784.
On 2024/2/2 at 9:42 AM, Li Wei wrote:
This FAIL was introduced from r14-6908. The reason is that when merging
constant vector permutation implementations, the 128-bit matching situation
was not fully considered. In fact, the expansion of 128-bit vectors after
merging only supports
On 2024/1/23 at 4:04 PM, Xi Ruoyao wrote:
On Tue, 2024-01-23 at 10:37 +0800, chenglulu wrote:
LGTM!
Thanks!
Pushed v2 as attached. The only change is in the comment: Qinggang told
me TLS LE relaxation actually *requires* explicit relocs.
I think one of the reasons is also because we cannot
Pushed to r14-8344.
On 2024/1/17 at 9:24 AM, chenxiaolong wrote:
gcc/ChangeLog:
* doc/sourcebuild.texi: Add attributes for keywords.
---
gcc/doc/sourcebuild.texi | 20
1 file changed, 20 insertions(+)
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
On 2024/1/19 at 4:51 PM, chenglulu wrote:
On 2024/1/19 at 1:46 PM, Xi Ruoyao wrote:
On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote:
Virtual register 1479 will be used in insn 2744, but register 1479
was
assigned the REG_UNUSED attribute in the previous instruction.
The attached file is the wrong file
LGTM!
Thanks!
On 2024/1/23 at 7:35 PM, Xi Ruoyao wrote:
When building GCC with --enable-default-ssp, the stack protector is
enabled for got-load.C, causing additional GOT loads for
__stack_chk_guard. So mem/u will be matched more than 2 times and the
test will fail.
Disable stack protector to fix
On 2024/1/24 at 5:36 PM, Li Wei wrote:
We found that when only 128-bit vectorization was enabled, 549.fotonik3d_r
failed to vectorize effectively. For this reason, we adjust the cost of
128-bit vector_stmt that match the multiply-add pattern to facilitate 128-bit
vectorization.
The experimental
Pushed to r14-8447.
On 2024/1/16 at 10:23 AM, Jiahao Xu wrote:
The pattern below can be treated as a simple move because floating point
and vector share a common register on loongarch64.
(set (reg/v:SF 32 $f0 [orig:93 res ] [93])
(vec_select:SF (reg:V8SF 32 $f0 [115])
(parallel [
Pushed to r14-8446.
On 2024/1/16 at 10:32 AM, Jiahao Xu wrote:
Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0, for a short-circuit branch, use the
short-circuit operation instead of the non-short-circuit operation.
SPEC2017 performance evaluation shows 1% performance improvement for fprate
GEOMEAN and no
On 2024/1/26 at 3:32 PM, Richard Biener wrote:
On Fri, Jan 26, 2024 at 7:23 AM chenxiaolong wrote:
gcc/testsuite/ChangeLog:
OK
Pushed to r14-8445.
Thank you everyone for your review!
* gcc.dg/signbit-2.c: Added additional "-mlsx" compilation options.
*
Pushed to r14-8444.
On 2024/1/24 at 5:44 PM, Li Wei wrote:
We found that in the spec17 521.wrf program, some loop invariant code generated
from single-precision floating-point approximate division calculation failed to
propose a loop. This is because the pseudo-register that stores the
intermediate
On 2024/1/26 at 4:52 PM, Xi Ruoyao wrote:
On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote:
+(define_insn "@load_tls"
[(set (match_operand:P 0 "register_operand" "=r")
(unspec:P
[(match_operand:P 1 "symbolic_operand" "")]
- UNSPEC_TLS_GD))]
+
On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote:
v3 -> v4:
1. Add macro support for TLS symbols
2. Added support for loading __get_tls_addr symbol address using call36.
3. Merge template got_load_tls_{ld/gd/le/ie}.
4. Enable explicit reloc for
On 2024/1/26 at 4:59 PM, chenglulu wrote:
On 2024/1/26 at 4:52 PM, Xi Ruoyao wrote:
On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote:
+(define_insn "@load_tls"
[(set (match_operand:P 0 "register_operand" "=r")
(unspec:P
[(match
On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote:
On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote:
v3 -> v4:
1. Add macro support for TLS symbols
2. Added support for loading __get_tls_addr symbol addr
On 2024/1/24 at 3:36 AM, Xi Ruoyao wrote:
On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
The failure of this test case was because the compiler believes that two
(UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
same result, but this isn't true because the result depends
On 2024/1/24 at 5:58 PM, Jiahao Xu wrote:
On 2024/1/24 at 5:48 PM, Xi Ruoyao wrote:
On Wed, 2024-01-24 at 17:19 +0800, Jiahao Xu wrote:
gcc/ChangeLog:
* config/loongarch/larchintrin.h
(__frecipe_s): Update function return type.
(__frecipe_d): Ditto.
(__frsqrte_s): Ditto.
(__frsqrte_d):
Jiahao:
Note that the LoongArch 'a' in the title needs to be capitalized.
I modified this patch and incorporated it first.
On 2024/1/24 at 5:19 PM, Jiahao Xu wrote:
It is incorrect to use vld/vori to implement the vec_concatz because when
the LSX
instruction is used to update the value of the
Pushed to r14-8412.
On 2024/1/23 at 11:54 AM, Lulu Cheng wrote:
TLS gd ld and ie type symbols will generate corresponding GOT entries,
so non-zero offsets cannot be generated.
The address of TLS le type symbol+addend is not implemented in binutils,
so non-zero offset is not generated here for the time
Pushed to r14-8414.
On 2024/1/24 at 5:19 PM, Jiahao Xu wrote:
It is incorrect to use vld/vori to implement the vec_concatz because when
the LSX
instruction is used to update the value of the vector register, the upper 128
bits of
the vector register will not be zeroed.
gcc/ChangeLog:
*
On 2024/2/2 at 6:01 PM, Jakub Jelinek wrote:
On Tue, Jan 30, 2024 at 10:09:51AM +0800, Lulu Cheng wrote:
From: chenguoqi
libsanitizer/ChangeLog:
* configure.tgt: Enable tsan and lsan for loongarch64.
* tsan/Makefile.am: Add tsan_rtl_loongarch64.S to
EXTRA_libtsan_la_SOURCES.
This
On 2024/2/5 at 1:01 AM, Xi Ruoyao wrote:
I have a question. I see that you often add compilation options in
BOOT_CFLAGS.
I also want to test it. Do you have a recommended set of compilation
options?
When I build a compiler for my system I use
{BOOT_{C,CXX,LD}FLAGS,{C,CXX,LD}FLAGS_FOR_TARGET}="-O3
On 2023/11/15 at 5:52 AM, Xi Ruoyao wrote:
This is isomorphic to the LLVM changes [1-2].
On LoongArch, the LL and SC instructions have memory barrier semantics:
- LL: +
- SC: +
But the compare and swap operation is allowed to fail, and if it fails
the SC instruction is not executed, thus the
On 2023/11/15 at 7:38 PM, Xi Ruoyao wrote:
Pushed r14-5486.
/* snip */
* gcc.target/loongarch/cas-acquire.c: New test.
This test fails with GCC 12/13 on LA664, and it indicates a correctness
issue. May I backport this patch to 12/13 as well?
I think we can backport.
Thanks!
Pushed to r14-5568.
On 2023/11/17 at 7:09 PM, Xi Ruoyao wrote:
On Fri, 2023-11-17 at 16:33 +0800, Lulu Cheng wrote:
Lulu Cheng (3):
LoongArch: Add LA664 support.
LoongArch: Implement atomic operations using LoongArch1.1
instructions.
LoongArch: atomic_load and atomic_store are
I have no problem, thanks for fixing the bug
On 2023/11/18 at 4:43 AM, Xi Ruoyao wrote:
Supersedes
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636795.html.
Requires
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636946.html.
Changes:
- Rebase on top of "Add LoongarchV1.1
Pushed to r14-5567.
On 2023/11/16 at 3:27 PM, Lulu Cheng wrote:
When compiling with '-mcmodel=medium', the function call is made through
'pcaddu18i+jirl' if binutils supports call36, otherwise the
native implementation 'pcalau12i+jirl' is used.
gcc/ChangeLog:
* config.in: Regenerate.
pushed to r14-5601
backport to r13-8085 and r12-9995.
r12 and r13 simultaneously synchronized the patch that changed '/lib64'
to '/lib'.
On 2023/11/18 at 11:15 AM, Lulu Cheng wrote:
Use no suffix at all in the musl dynamic linker name for hard
float ABI. Use -sf and -sp suffixes in musl dynamic
On 2023/11/20 at 9:51 AM, Xi Ruoyao wrote:
On Mon, 2023-11-20 at 09:09 +0800, chenglulu wrote:
On 2023/11/19 at 1:24 AM, Xi Ruoyao wrote:
On Sat, 2023-11-18 at 16:16 +0800, chenglulu wrote:
Pushed to r14-5567.
On 2023/11/16 at 3:27 PM, Lulu Cheng wrote:
When compiling with '-mcmodel=medium', the function call
On 2023/11/19 at 1:24 AM, Xi Ruoyao wrote:
On Sat, 2023-11-18 at 16:16 +0800, chenglulu wrote:
Pushed to r14-5567.
On 2023/11/16 at 3:27 PM, Lulu Cheng wrote:
When compiling with '-mcmodel=medium', the function call is made through
'pcaddu18i+jirl' if binutils supports call36, otherwise the
native
Pushed to r14-5547.
On 2023/11/17 at 10:38 AM, Li Wei wrote:
LoongArch has defined ctz and clz in the backend, but if we want GCC to
perform the CTZ transformation optimization in the forwprop2 pass, GCC needs to know
the value of c[lt]z at zero, which may be beneficial for some test cases
(like spec2017
LGTM.
Thanks.
On 2023/11/14 at 4:07 AM, Xi Ruoyao wrote:
With LSX or LASX, copysign (x[i], -1) (or any negative constant) can be
vectorized using [x]vbitseti.{w/d} instructions to directly set the
signbits.
Inspired by Tamar Christina's "AArch64: Handle copysign (x, -1) expansion
efficiently"
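The observation behind this patch is that `copysign (x, -1)` only needs to force the sign bit to 1, whatever the value of `x`. The scalar C sketch below illustrates that equivalence, assuming IEEE 754 binary64 `double`; the name `set_sign_bit` is hypothetical and this is not the vector expansion from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: equivalent to copysign (x, -1.0), implemented
   by setting the sign bit with a single OR.  */
static double
set_sign_bit (double x)
{
  uint64_t b;
  memcpy (&b, &x, sizeof b);
  b |= UINT64_C (1) << 63;  /* Force the sign bit on.  */
  memcpy (&x, &b, sizeof x);
  return x;
}
```

A positive input becomes negative and a negative input is unchanged, which is the per-element effect a bit-set instruction achieves in one operation.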
On 2023/11/17 at 8:31 PM, Xi Ruoyao wrote:
On Fri, 2023-11-17 at 16:33 +0800, Lulu Cheng wrote:
Define ISA_BASE_LA64V110, which represents the base instruction set defined in
LoongArch1.1.
Support the configure setting --with-arch=la664, and support
-march=la664,-mtune=la664.
gcc/ChangeLog:
On 2023/11/17 at 12:55 PM, Xi Ruoyao wrote:
On Fri, 2023-11-17 at 10:41 +0800, chenglulu wrote:
Hi,
Thank you very much for the modification, but I think we need to support
la664 with the configuration items of configure.
I'll add it.
I also defined ISA_BASE_LA64V110 to represent the LoongArch1.1
Hi,
Thank you very much for the modification, but I think we need to support
la664 with the configuration items of configure.
I also defined ISA_BASE_LA64V110 to represent the LoongArch1.1
instruction set, what do you think?
On 2023/11/16 at 9:18 PM, Xi Ruoyao wrote:
Loongson 3A6000 processor
Pushed to r14-5544
On 2023/11/16 at 8:31 PM, Jiahao Xu wrote:
These tests failed when they were first added; this patch adjusts the
scan-assembler-times directives to fix them.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/vector/lasx/lasx-vcond-1.c: Adjust assembler
times.
*
Pushed to r14-5545.
On 2023/11/16 at 4:44 PM, Jiahao Xu wrote:
Based on SPEC2017 performance evaluation results, it's better to make them equal
to the cost of unaligned store/load so as to avoid odd alignment peeling.
gcc/ChangeLog:
* config/loongarch/loongarch.cc
On 2023/11/14 at 5:55 PM, Xi Ruoyao wrote:
On Tue, 2023-11-14 at 17:45 +0800, Lulu Cheng wrote:
+ /* When function calls are made through call36, t0 register will be
+implicitly modified, so '-fno-ipa-ra' needs to be set here. */
case CMODEL_MEDIUM:
+ if
On 2023/11/14 at 4:34 PM, Xi Ruoyao wrote:
On Tue, 2023-11-14 at 10:26 +0800, chenglulu wrote:
Hi,
* Before calling this template, the function get_memmodel is called to
process memmodel, which has a piece of code:
/* Workaround for Bugzilla 59448. GCC doesn't track consume properly
On 2023/11/14 at 4:50 PM, Xi Ruoyao wrote:
Ping. I've tested this with Binutils 2.41 and 2.41.50.202311xx several
times so it should be OK.
On Mon, 2023-11-06 at 15:50 +0800, Xi Ruoyao wrote:
/* snip */
Bootstrapped and regtested on loongarch64-linux-gnu twice: once with
Binutils 2.41, another
On 2023/11/14 at 7:18 AM, Xi Ruoyao wrote:
/* snip */
(define_insn "mem_thread_fence_1"
[(set (match_operand:BLK 0 "" "")
(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))
(match_operand:SI 1 "const_int_operand" "")] ;; model
""
- "dbar\t0")
+ {
+enum memmodel model =
On 2023/11/11 at 6:58 PM, Xi Ruoyao wrote:
fld and fst have same address mode as ld.w and st.w, so the same
optimization as r14-4851 should be applied for them too.
gcc/ChangeLog:
* config/loongarch/loongarch.md (LD_AT_LEAST_32_BIT): New mode
iterator.
(ST_ANY): New mode
On 2023/11/12 at 9:00 AM, Xi Ruoyao wrote:
GCC internal says:
'subreg's of 'subreg's are not supported. Using
'simplify_gen_subreg' is the recommended way to avoid this problem.
Unfortunately loongarch_expand_vec_cond_mask_expr might create nested
subreg under certain circumstances,
On 2023/11/19 at 3:01 PM, Xi Ruoyao wrote:
The vec_perm expander was wrongly defined. GCC internal says:
Operand 3 is the “selector”. It is an integral mode vector of the same
width and number of elements as mode M.
With this mistake, the generic code manages to work around and it ends
up creating
On 2024/3/7 at 12:05 PM, mengqinggang wrote:
Hi,
Thanks, this patch is LGTM.
I don't have a problem either.
Thanks.
On 2024/3/7 at 10:56 AM, Xi Ruoyao wrote:
On Thu, 2024-03-07 at 10:43 +0800, mengqinggang wrote:
Hi,
Whether to add an option to control the generation of R_LARCH_RELAX,
similar to as
On 2024/3/18 at 5:34 PM, Xi Ruoyao wrote:
We were assuming TYPE_NO_NAMED_ARGS_STDARG_P don't have any named
arguments and there is nothing to advance, but that is not the case
for (...) functions returning by hidden reference which have one such
artificial argument. This is causing
On 2024/3/13 at 9:03 PM, Xi Ruoyao wrote:
If this insn is really used, we'll have something like
slti $r4,$r0,$r5
in the code. The assembler will reject it because slti wants 2
register operands and 1 immediate operand. But we've not got any bug
report for this, indicating this define_insn is
Pushed to r14-9486.
On 2024/3/14 at 9:26 AM, Chenghui Pan wrote:
The behavior of non-zero unused bits in xvpermi.q instruction's
third operand is undefined on LoongArch, according to our
discussion (https://github.com/llvm/llvm-project/pull/83540),
we think that keeping original insn operand as
Pushed to r14-9562...r14-9564.
On 2024/3/15 at 9:30 AM, Chenghui Pan wrote:
Changes from v1: Some correction about ChangeLog format.
There are some unused/redundant definitions inside LoongArch target support
codes, these patches make a simple cleanup. Regression test passed.
Chenghui Pan (3):
On 2024/3/7 at 8:52 PM, Xi Ruoyao wrote:
It should be better to extend the expected value before the ll/sc loop
(like what LLVM does), instead of repeating the extending in each
iteration. Something like:
I wanted to do this at first, but it didn't work out.
But then I thought about it, and there
On 2024/3/1 at 5:39 PM, mengqinggang wrote:
Thanks, I will try to send a new version of the patch next week.
On 2024/2/29 at 2:08 PM, Xi Ruoyao wrote:
On Thu, 2024-02-29 at 09:42 +0800, mengqinggang wrote:
Generate la.tls.desc macro instruction for TLS descriptors model.
la.tls.desc expands to
pcalau12i $a0,
Pushed to r14-9352.
On 2024/3/6 at 4:54 PM, chenxiaolong wrote:
In simd_correctness_check.h, the role of the macro ASSERTEQ_64 is to check the
result of the passed vector values for the 64-bit data of each array element.
It turns out that it uses the abs() function to check only the lower 32 bits
of
Pushed to r14-9351.
On 2024/3/6 at 9:19 AM, Yang Yujie wrote:
gcc/ChangeLog:
* config.gcc: Add a case for loongarch*-*-linux-musl*.
* config/loongarch/linux.h: Disable the multilib-compatible
treatment for *musl* targets.
* config/loongarch/musl.h: New file.
---
This test case is so cleverly designed!
I have no problem. Thank you!
On 2024/3/5 at 9:00 PM, Xi Ruoyao wrote:
Loops on named vector register are not vectorized (see comment 11 of
PR113622), so these test cases have been failing for a while.
Rewrite them using check-function-bodies to remove hard
Pushed to r14-9407.
On 2024/3/7 at 9:12 AM, Lulu Cheng wrote:
If the hardware does not support LAMCAS, atomic_compare_and_swapsi needs to be
implemented through "ll.w+sc.w". In the implementation of the instruction
sequence,
it is necessary to determine whether the two registers are equal.
Since
Pushed to r14-9408.
On 2024/3/7 at 9:50 AM, Lulu Cheng wrote:
When the value of the macro DEFAULT_CFLAGS is set to '-ansi -pedantic-errors',
regname-s9-fp.c will fail. To solve this problem, add the compilation
option '-Wno-pedantic -std=gnu90' to this test case.
gcc/testsuite/ChangeLog: