d[ \\t] 694
>
>
> 2024-06-06 Roger Sayle
> Hongtao Liu
>
> gcc/ChangeLog
> * config/i386/i386-expand.cc (ix86_expand_args_builtin): Call
> fixup_modeless_constant before testing predicates. Only call
> copy_to_mode_reg on memory operands
Does r15-1050-gfcfce55c85f842ed843cbc4aabe744c6a004dead fix the failure?
On Thu, Jun 6, 2024 at 10:06 PM ci_notify--- via Gcc-regression
wrote:
>
> Dear contributor, our automatic CI has detected problems related to your
> patch(es). Please find some details below. If you have any questions,
>
https://gcc.gnu.org/g:b24f2954dbc13d85e9fb62e05a88e9df21e4d4f4
commit r15-1088-gb24f2954dbc13d85e9fb62e05a88e9df21e4d4f4
Author: liuhongt
Date: Fri Jun 7 09:29:24 2024 +0800
Add additional option --param max-completely-peeled-insns=200 for
power64*-*-*
gcc/testsuite/ChangeLog:
On Thu, Jun 6, 2024 at 2:39 PM Hongyu Wang wrote:
>
> Current target apxf check does not specify sub-features that assembler
> supports, so the check with older binutils will fail at assemble stage
> for new apx features like NF,CCMP or CFCMOV. Adjust the assembler check
> for latest apx
https://gcc.gnu.org/g:fcfce55c85f842ed843cbc4aabe744c6a004dead
commit r15-1050-gfcfce55c85f842ed843cbc4aabe744c6a004dead
Author: liuhongt
Date: Thu Jun 6 11:27:53 2024 +0800
Refine testcase for power10.
For power10, there're extra 3 REG_EQUIV notes with (fix:SI. to avoid
the
https://gcc.gnu.org/g:961dd0d635217c703a38c48903981e0d60962546
commit r15-1048-g961dd0d635217c703a38c48903981e0d60962546
Author: liuhongt
Date: Fri Apr 19 10:39:53 2024 +0800
Adjust rtx_cost for MEM to enable more simplification
For CONST_VECTOR_DUPLICATE_P in constant_pool, it is
https://gcc.gnu.org/g:7876cde25cbd2f026a0ae488e5263e72f8e9bfa0
commit r15-1047-g7876cde25cbd2f026a0ae488e5263e72f8e9bfa0
Author: liuhongt
Date: Fri Apr 19 10:29:34 2024 +0800
Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for vector mode.
When mask is (1 << (prec - imm)
On Wed, Jun 5, 2024 at 10:44 PM Jeff Law wrote:
>
>
>
> On 6/4/24 10:22 PM, liuhongt wrote:
> >> Can you add a testcase for this? I don't mind if it's x86 specific and
> >> does a bit of asm scanning.
> >>
> >> Also note that the context for this patch has changed, so it won't
> >> automatically
https://gcc.gnu.org/g:b05288d1f1e4b632eddf8830b4369d4659f6c2ff
commit r15-1022-gb05288d1f1e4b632eddf8830b4369d4659f6c2ff
Author: liuhongt
Date: Tue May 21 16:57:17 2024 +0800
Don't simplify NAN/INF or out-of-range constant for FIX/UNSIGNED_FIX.
According to IEEE standard, for
https://gcc.gnu.org/g:4d207044195b97ecb27c72a7dc987eb8b86644a0
commit r15-1003-g4d207044195b97ecb27c72a7dc987eb8b86644a0
Author: liuhongt
Date: Tue Jun 4 10:13:09 2024 +0800
Adjust testcase for -march=cascadelake
gcc/testsuite/ChangeLog:
PR target/115299
https://gcc.gnu.org/g:ac306de7d5100d3682eae2270995a9abbe19db38
commit r15-984-gac306de7d5100d3682eae2270995a9abbe19db38
Author: liuhongt
Date: Fri May 31 14:38:07 2024 +0800
Add some preference for floating point rtl ifcvt when sse4.1 is not
available
W/o TARGET_SSE4_1, it takes
On Wed, May 29, 2024 at 11:05 AM Haochen Jiang wrote:
>
> Hi all,
>
> Since AVX10 is the first major ISA introduced after AVX-512, we propose
> to add target_clones support for it.
>
> Although AVX10.1-256 won't cover 512-bit part of AVX512F, but since
> it is only for priority but not for
On Wed, May 29, 2024 at 1:11 PM Kong, Lingling wrote:
>
> Hi, compared with v2, these patches restored the original lea pattern position
> and addressed hongtao's comment.
>
> The APX NF (no flags) feature suppresses the update of status flags
> for arithmetic operations.
Ok for the patch
On Wed, May 15, 2024 at 4:21 PM Hongyu Wang wrote:
>
> The ccmp insn itself doesn't support fp compare, but x86 has fp comi
> insn that changes EFLAG which can be the scc input to ccmp. Allow
> scalar fp compare in ix86_gen_ccmp_first except ORDERED/UNORDERED
> compare which can not be identified
On Wed, May 15, 2024 at 4:24 PM Hongyu Wang wrote:
>
> APX CCMP feature implements conditional compare which executes compare
> when EFLAGS matches certain condition.
>
> CCMP introduces default flags value (dfv), when conditional compare does
> not execute, it will directly set the flags
On Tue, May 28, 2024 at 4:00 PM Hu, Lin1 wrote:
>
> Hi all,
>
> This patch aims to achieve EQ/NE comparison between avx512 kmask and -1
> by using kxortest with checking CF.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,-m64}. Ok for trunk?
Ok.
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
>
On Fri, May 31, 2024 at 10:58 AM Hanke Zhang via Gcc wrote:
>
> Hi,
> I've recently been trying to hand-write code to trigger automatic
> vectorization optimizations in GCC on Intel x86 machines (without
> using the interfaces in immintrin.h), but I'm running into a problem
> where I can't seem
https://gcc.gnu.org/g:3a873c0a7bc8183de95a6103b507101a25eed413
commit r15-932-g3a873c0a7bc8183de95a6103b507101a25eed413
Author: liuhongt
Date: Thu May 30 14:15:48 2024 +0800
Rename double_u with __double_u to avoid polluting the namespace.
gcc/ChangeLog:
*
On Wed, May 29, 2024 at 5:00 PM Hu, Lin1 wrote:
>
> According to hongtao's suggestion, I support some trunc in mmx.md under
> x86-64-v3, and optimize ix86_expand_trunc_with_avx2_noavx512f.
Ok.
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
> PR 107432
> * config/i386/i386-expand.cc
https://gcc.gnu.org/g:b6c6d5abf0d31c936f50f8f9073c5e335b9e24b7
commit r15-920-gb6c6d5abf0d31c936f50f8f9073c5e335b9e24b7
Author: liuhongt
Date: Wed Feb 28 11:17:10 2024 +0800
Support vcond_mask_qiqi and friends.
gcc/ChangeLog:
* config/i386/sse.md (vcond_mask_):
https://gcc.gnu.org/g:ef27b91b62c3aa8841c02665dffa8914c742fd37
commit r15-919-gef27b91b62c3aa8841c02665dffa8914c742fd37
Author: liuhongt
Date: Tue Feb 27 15:34:57 2024 +0800
Don't reduce estimated unrolled size for innermost loop.
For the innermost loop, after completely loop
On Wed, May 29, 2024 at 4:56 PM Hu, Lin1 wrote:
>
> Apart from adding TARGET_MMX_WITH_SSE, I merged the two patterns.
Ok.
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
> PR target/107432
> * config/i386/mmx.md
> (VI2_32_64): New mode iterator.
> (mmxhalfmode): New mode attr.
> (mmxhalfmodelower):
On Thu, May 16, 2024 at 5:15 PM Hongyu Wang wrote:
>
> Richard Biener wrote on Thu, May 16, 2024 at 15:05:
>
> >
> > On Thu, May 16, 2024 at 8:25 AM Hongyu Wang wrote:
> > >
> > > Hi,
> > >
> > > In ix86_override_options_after_change, calls to ix86_default_align
> > > and ix86_recompute_optlev_based_flags
https://gcc.gnu.org/g:1d6199e5f8c1c08083eeb0279f71333234fe14ad
commit r15-882-g1d6199e5f8c1c08083eeb0279f71333234fe14ad
Author: liuhongt
Date: Mon Feb 19 13:57:24 2024 +0800
Reduce cost of MEM (A + imm).
For MEM, rtx_cost iterates each subrtx, and adds up the costs,
so for
https://gcc.gnu.org/g:c65002347e595cda8b15e59e734d209283faf2b6
commit r15-857-gc65002347e595cda8b15e59e734d209283faf2b6
Author: liuhongt
Date: Tue May 28 10:32:12 2024 +0800
Fix predicate mismatch between vfcmaddcph's define_insn and define_expand.
When I applied Roger's patch
On Mon, May 27, 2024 at 2:48 PM Hongtao Liu wrote:
>
> On Sat, May 18, 2024 at 4:10 AM Roger Sayle
> wrote:
> >
> >
> > Hi Hongtao,
> > Many thanks for the review, bug fixes and suggestions for improvements.
> > This revised version of the pa
On Sat, May 18, 2024 at 4:10 AM Roger Sayle wrote:
>
>
> Hi Hongtao,
> Many thanks for the review, bug fixes and suggestions for improvements.
> This revised version of the patch, implements all of your corrections. In
> theory
> the "ternlog idx" should guarantee that some operands are
On Tue, May 21, 2024 at 5:46 AM Alexander Monakov wrote:
>
>
> Hello!
>
> I looked at ternlog a bit last year, so I'd like to offer some drive-by
> comments. If you want to tackle them in a follow-up patch, or leave for
> someone else to handle, please let me know.
>
> On Fri, 17 May 2024, Roger
On Thu, May 23, 2024 at 2:38 PM Hu, Lin1 wrote:
>
> gcc/ChangeLog:
>
> PR target/107432
> * config/i386/mmx.md (truncv4hiv4qi2): New define_insn.
>
> gcc/testsuite/ChangeLog:
>
> PR target/107432
> * gcc.target/i386/pr107432-6.c: Add test.
> ---
> gcc/config/i386/mmx.md
On Mon, May 20, 2024 at 11:15 AM Hongtao Liu wrote:
>
> On Wed, May 15, 2024 at 11:30 AM Jiang, Haochen
> wrote:
> >
> > Also cc Honza and Richard since we touched generic tune.
> >
> > Thx,
> > Haochen
> >
> > > -Original Message-
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652231.html
Ok for this.
--
BR,
Hongtao
https://gcc.gnu.org/g:51f4b47c4f4f61fe31a7bd1fa80e08c2438d76a8
commit r15-814-g51f4b47c4f4f61fe31a7bd1fa80e08c2438d76a8
Author: liuhongt
Date: Fri May 24 09:49:08 2024 +0800
Fix typo in the testcase.
gcc/testsuite/ChangeLog:
PR target/114148
*
CC for review.
On Tue, May 21, 2024 at 1:12 PM liuhongt wrote:
>
> When mask is ((1 << (prec - imm)) - 1), which is used to clear the upper bits
> of A, the expression can be simplified to LSHIFTRT.
>
> i.e. simplify
> (and:v8hi
> (ashiftrt:v8hi A 8)
> (const_vector 0xff x8))
> to
> (lshiftrt:v8hi A 8)
>
>
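The simplification quoted above can be checked on a single 16-bit lane. The following is a minimal sketch, not code from the patch: the function names are hypothetical, and the arithmetic shift is emulated with unsigned operations so the check does not depend on how a compiler treats `>>` on negative values.

```c
#include <assert.h>
#include <stdint.h>

/* (and (ashiftrt A imm) mask) on one 16-bit lane, with
   mask = (1 << (16 - imm)) - 1.  Hypothetical helper for illustration. */
uint16_t ashiftrt_then_and(uint16_t a, unsigned imm)
{
    /* Emulate the arithmetic shift: shift logically, then fill the top
       imm bits with copies of the sign bit. */
    uint16_t logical = (uint16_t)(a >> imm);
    uint16_t sign_fill = (a & 0x8000u) ? (uint16_t)~(0xFFFFu >> imm) : 0;
    uint16_t arith = (uint16_t)(logical | sign_fill);
    uint16_t mask = (uint16_t)((1u << (16 - imm)) - 1u);
    return (uint16_t)(arith & mask);
}

/* (lshiftrt A imm) on one 16-bit lane. */
uint16_t lshiftrt(uint16_t a, unsigned imm)
{
    return (uint16_t)(a >> imm);
}

/* Returns 1 if both forms agree for every 16-bit value and imm in 1..15. */
int forms_agree(void)
{
    for (unsigned imm = 1; imm < 16; imm++)
        for (uint32_t a = 0; a <= 0xFFFFu; a++)
            if (ashiftrt_then_and((uint16_t)a, imm) != lshiftrt((uint16_t)a, imm))
                return 0;
    return 1;
}
```

The mask clears exactly the bits that the arithmetic shift fills with the sign, which is why the two forms coincide; the v8hi case in the message corresponds to imm = 8 and mask 0xff.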
On Thu, May 23, 2024 at 3:17 PM Hu, Lin1 wrote:
>
> > -Original Message-
> > From: Hongtao Liu
> > Sent: Thursday, May 23, 2024 2:42 PM
> > To: Hu, Lin1
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ;
> > ubiz...@gmail.com; rguent...@suse.de
>
On Thu, May 23, 2024 at 2:38 PM Hu, Lin1 wrote:
>
> gcc/ChangeLog:
>
> PR 107432
> * config/i386/i386-expand.cc (ix86_expand_trunc_with_avx2_noavx512f):
> New function to generate a series of suitable insns.
> * config/i386/i386-protos.h
On Wed, May 22, 2024 at 3:59 PM Jakub Jelinek wrote:
>
> On Wed, May 22, 2024 at 09:46:41AM +0200, Richard Biener wrote:
> > On Wed, May 22, 2024 at 3:58 AM liuhongt wrote:
> > >
> > > According to the IEEE standard, for conversions from floating point to
> > > integer, when a NaN or infinite
On Wed, May 22, 2024 at 1:07 PM liuhongt wrote:
>
> >> Hard to find a default value satisfying all testcases.
> >> some require loop unroll with 7 insns increment, some don't want loop
> >> unroll w/ 5 insn increment.
> >> The original 2/3 reduction happened to meet all those testcases(or the
>
On Tue, May 21, 2024 at 3:14 PM Haochen Jiang wrote:
>
> Hi all,
>
> This is the v2 patch to fix PR115069. The new testcase has passed.
>
> Changes in v2:
> - Added a testcase.
> - Change the comment for the early exit.
>
> Thx,
> Haochen
>
> Since vpermq is really slow, we should avoid using
On Tue, May 21, 2024 at 2:16 PM Haochen Jiang wrote:
>
> Hi all,
>
> Since vpermq is really slow, we should avoid using it when it is
> the only instruction that could be used for ix86_expand_vecop_qihi2.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu. Ok for trunk?
Please add a testcase for
https://gcc.gnu.org/g:0ebaffccb294d90184ad78367de66b6307de3ac0
commit r15-717-g0ebaffccb294d90184ad78367de66b6307de3ac0
Author: liuhongt
Date: Fri Mar 22 14:40:00 2024 +0800
Use pblendw instead of pand to clear upper 16 bits.
For vec_pack_truncv8si/v4si w/o AVX512,
On Wed, May 15, 2024 at 5:24 PM Richard Biener
wrote:
>
> On Wed, May 15, 2024 at 4:15 AM Hongtao Liu wrote:
> >
> > On Mon, May 13, 2024 at 3:40 PM Richard Biener
> > wrote:
> > >
> > > On Mon, May 13, 2024 at 4:29 AM liuhongt wrote:
> > > >
On Wed, May 15, 2024 at 11:30 AM Jiang, Haochen wrote:
>
> Also cc Honza and Richard since we touched generic tune.
>
> Thx,
> Haochen
>
> > -Original Message-
> > From: Haochen Jiang
> > Sent: Wednesday, May 15, 2024 11:04 AM
> > To: gcc-patch
On Fri, May 17, 2024 at 3:55 PM Uros Bizjak wrote:
>
> Rename _3 expander to a standard ssadd,
> usadd, sssub and ussub name to enable corresponding optab expansion.
>
> Also add named expander for MMX modes.
LGTM.
>
> PR middle-end/112600
>
> gcc/ChangeLog:
>
> * config/i386/mmx.md (3):
> >
> Sorry to chime in, for x86 backend, we defined usdot_prodv16hi, and
> 2-way dot_prod operations can be generated
>
Here is the link: https://godbolt.org/z/hcWr64vx3. x86 defines
udot_prodv16qi/udot_prod8hi, and both 2-way and 4-way dot_prod
instructions are generated.
--
BR,
Hongtao
On Thu, May 16, 2024 at 10:40 PM Victor Do Nascimento
wrote:
>
> From: Victor Do Nascimento
>
> At present, the compiler offers the `{u|s|us}dot_prod_optab' direct
> optabs for dealing with vectorizable dot product code sequences. The
> consequence of using a direct optab for this is that
https://gcc.gnu.org/g:090714e6cf8029f4ff8883dce687200024adbaeb
commit r15-530-g090714e6cf8029f4ff8883dce687200024adbaeb
Author: liuhongt
Date: Wed May 15 10:56:24 2024 +0800
Set d.one_operand_p to true when TARGET_SSSE3 in
ix86_expand_vecop_qihi_partial.
pshufb is available
https://gcc.gnu.org/g:0cc0956b3bb8bcbc9196075b9073a227d799e042
commit r15-529-g0cc0956b3bb8bcbc9196075b9073a227d799e042
Author: liuhongt
Date: Tue May 14 18:39:54 2024 +0800
Optimize ashift >> 7 to vpcmpgtb for vector int8.
Since there is no corresponding instruction, the shift
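As a rough illustration of why a compare can stand in for the shift, the sketch below assumes the commit title refers to an arithmetic right shift by 7 on signed 8-bit lanes. It models one lane in scalar C with hypothetical function names; it is not the expander code from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic right shift by 7 of one int8 lane, emulated with unsigned
   operations; the result is returned as the raw 8-bit pattern. */
uint8_t shift_form(int8_t x)
{
    uint8_t u = (uint8_t)x;
    uint8_t logical = (uint8_t)(u >> 7);             /* 0 or 1 */
    uint8_t sign_fill = (u & 0x80u) ? 0xFEu : 0x00u; /* sign copies in bits 7..1 */
    return (uint8_t)(logical | sign_fill);
}

/* Signed "0 > x" compare that yields an all-ones lane when true. */
uint8_t compare_form(int8_t x)
{
    return (0 > x) ? 0xFFu : 0x00u;
}

/* Returns 1 if both forms agree for every int8 value. */
int lanes_agree(void)
{
    for (int v = -128; v <= 127; v++)
        if (shift_form((int8_t)v) != compare_form((int8_t)v))
            return 0;
    return 1;
}
```

Both forms produce 0xFF for negative inputs and 0x00 otherwise, so a per-lane greater-than compare against zero gives the same bytes as the shift.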
C -std=gnu++14 LP64 note (test for
> >
> > g++warnings, line 56)
> >
> > g++: g++.dg/warn/Warray-bounds-20.C -std=gnu++14 note (test for
> >
> > g++warnings, line 66)
> >
> > g++: g++.dg/warn/Warray-bounds-20.C -std=gnu++17 LP64 note (test for
> >
> > g++warnings, line 56)
> >
> > g++:
https://gcc.gnu.org/g:a71f90c5a7ae2942083921033cb23dcd63e70525
commit r15-499-ga71f90c5a7ae2942083921033cb23dcd63e70525
Author: Levy Hsu
Date: Thu May 9 16:50:56 2024 +0800
x86: Add 3-instruction subroutine vector shift for V16QI in
ix86_expand_vec_perm_const_1 [PR107563]
Hi All
On Mon, May 13, 2024 at 3:40 PM Richard Biener
wrote:
>
> On Mon, May 13, 2024 at 4:29 AM liuhongt wrote:
> >
> > As the testcase in the PR shows, -O3 cunrolli may prevent vectorization for the
> > innermost loop and increase register pressure.
> > The patch removes the 1/3 reduction of unr_insn for
On Mon, May 13, 2024 at 5:57 AM Roger Sayle wrote:
>
>
> This patch improves the way that the x86 backend recognizes and
> expands AVX512's bitwise ternary logic (vpternlog) instructions.
I like the patch.
1 file changed, 25 insertions(+), 1 deletion(-)
gcc/config/i386/i386-expand.cc | 26
s, translate
+ * them as needed.
+ */
nitpicking: This should probably be
+ * them as needed. */
CC'ing Jonathan Yong. This series of patches look good to me.
--
Best regards,
LIU Hao
ld also fix this mem operand
> issue. I hope to submit it for review this weekend.
I opened a PR for that. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115021
>
> Thanks again,
> Roger
>
> > From: Hongtao Liu
> > On Fri, May 10, 2024 at 6:26 AM Roger Sayle
> > w
On Fri, May 10, 2024 at 6:26 AM Roger Sayle wrote:
>
>
> The following one line patch improves the code generated for V8QI and V4QI
> shifts when AVX512BW and AVX512VL functionality is available.
+ /* With AVX512 its cheaper to do vpmovsxbw/op/vpmovwb. */
+ && !(TARGET_AVX512BW &&
ode()` be called only when `handle` is valid? I think you may initialize
`isconsole` to `false`; then only if the handle is valid, should it be set accordingly; and this
function just returns `isconsole`.
The other two patches look good to me.
--
Best regards,
LIU Hao
On Wed, May 8, 2024 at 10:13 AM Hu, Lin1 wrote:
>
> Hi all,
>
> This patch aims to fix the problem that some intrinsics without an
> alignment requirement raise runtime errors.
>
> Bootstrapped and tested on x86_64-linux-gnu, OK for trunk?
Ok.
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
> PR
On Mon, May 6, 2024 at 3:40 PM Kong, Lingling wrote:
>
> Hi,
> Originally eliminate_regs_in_insn will transform
> (parallel [
> (set (reg:QI 130)
> (plus:QI (subreg:QI (reg:DI 19 frame) 0)
> (const_int 96)))
> (clobber (reg:CC 17 flag))]) {*addqi_1}
> to
> (set (reg:QI 130)
>
https://gcc.gnu.org/g:a9f642783853b60bb0a59562b8ab3ed10ec01641
commit r15-234-ga9f642783853b60bb0a59562b8ab3ed10ec01641
Author: liuhongt
Date: Wed Dec 20 11:54:43 2023 +0800
Optimize 64-bit vector permutation with punpcklqdq + 128-bit vector pshuf.
gcc/ChangeLog:
https://gcc.gnu.org/g:8b974f54393ab2d2d16a0051a68c155455a92aad
commit r15-236-g8b974f54393ab2d2d16a0051a68c155455a92aad
Author: liuhongt
Date: Mon Jan 8 15:13:41 2024 +0800
Extend usdot_prodv*qi with vpmaddwd when AVXVNNI/AVX512VNNI is not
available.
gcc/ChangeLog:
https://gcc.gnu.org/g:fa911365490a7ca308878517a4af6189ffba7ed6
commit r15-235-gfa911365490a7ca308878517a4af6189ffba7ed6
Author: liuhongt
Date: Wed Dec 20 11:43:25 2023 +0800
Support dot_prod optabs for 64-bit vector.
gcc/ChangeLog:
PR target/113079
*
CC uros.
On Mon, May 6, 2024 at 11:03 AM Kong, Lingling wrote:
>
> Hi,
> (if_then_else:SI (eq (reg:CCZ 17 flags)
> (const_int 0 [0]))
> (reg/v:SI 101 [ e ])
> (reg:SI 102))
> The cost is 8 for the rtx, the cost for
> (eq (reg:CCZ 17 flags) (const_int 0 [0])) is 4, but this is
https://gcc.gnu.org/g:affd77d3fe7bfb525b3fb23316d164e847ed02d1
commit r15-167-gaffd77d3fe7bfb525b3fb23316d164e847ed02d1
Author: liuhongt
Date: Wed Mar 27 08:20:13 2024 +0800
Update libbid according to the latest Intel Decimal Floating-Point Math
Library.
The Intel Decimal
On Tue, Apr 30, 2024 at 3:38 PM Jakub Jelinek wrote:
>
> On Tue, Apr 30, 2024 at 09:30:00AM +0200, Richard Biener wrote:
> > On Mon, Apr 29, 2024 at 5:30 PM H.J. Lu wrote:
> > >
> > > On Mon, Apr 29, 2024 at 6:47 AM liuhongt wrote:
> > > >
> > > > The Fortran standard does not specify what the
https://gcc.gnu.org/g:c19a674d03847b900919b97d0957c8ae5164f8f1
commit r15-22-gc19a674d03847b900919b97d0957c8ae5164f8f1
Author: liuhongt
Date: Tue Apr 16 08:37:22 2024 +0800
Adjust alternative *k to ?k for avx512 mask in zero_extend patterns
So when both source operand and dest
Attached is an alternative patch to functionalize `load_macros_array`. It allows GCC to build on
x86_64-w64-mingw32. Not tested though, as I know no Rust.
As before, please edit the patch at your disposal.
--
Best regards,
LIU Hao
diff --git a/gcc/rust/checks/errors/borrowck/rust-borrow
Rust.
--
Best regards,
LIU Hao
Hello,
Attached is a patch for fixing build issues on *-w64-mingw32. Please check and
update at your leisure.
'gcc/system.h' contains a macro called `mkdir()` and there is no need to invoke `_mkdir()` within a
conditional block.
--
Best regards,
LIU Hao
diff --git a/gcc/rust/checks/errors
On Wed, Apr 24, 2024 at 1:46 PM Haochen Jiang wrote:
>
> Hi all,
>
> When we are using -mavx10.1-256 in command line and avx10.1-256 in
> target attribute together, zmm should never be generated. But current
> GCC will generate zmm since it wrongly enables EVEX512 for non-explicitly
> set AVX512.
On Sat, Apr 13, 2024 at 6:42 AM H.J. Lu wrote:
>
> The x86 instruction size limit is 15 bytes. If a NDD instruction has
> a segment prefix byte, a 4-byte opcode prefix, a MODRM byte, a SIB byte,
> a 4-byte displacement and a 4-byte immediate, adding an address size
> prefix will exceed the size
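The byte count in the message above can be written out as a small tally. This is a sketch under stated assumptions: the struct and function names are hypothetical, and it counts only the fields the message lists, nothing else.

```c
#include <assert.h>

/* Instruction size limit stated in the message. */
enum { X86_MAX_INSN_BYTES = 15 };

/* Byte counts for the fields named in the message (hypothetical struct). */
struct insn_parts {
    unsigned segment_prefix;
    unsigned opcode_prefix;
    unsigned modrm;
    unsigned sib;
    unsigned displacement;
    unsigned immediate;
    unsigned address_size_prefix;
};

/* Sum of all listed fields, in bytes. */
unsigned insn_length(const struct insn_parts *p)
{
    return p->segment_prefix + p->opcode_prefix + p->modrm + p->sib
         + p->displacement + p->immediate + p->address_size_prefix;
}

/* Nonzero when the tally is larger than the limit. */
int exceeds_limit(const struct insn_parts *p)
{
    return insn_length(p) > (unsigned)X86_MAX_INSN_BYTES;
}
```

With the values from the message (1, 4, 1, 1, 4 and 4 bytes) the tally is exactly 15, so a one-byte address size prefix pushes it to 16.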
> -Original Message-
> From: Jakub Jelinek
> Sent: Thursday, April 11, 2024 4:39 PM
> To: Richard Biener ; Jeff Law ;
> Liu, Hongtao
> Cc: gcc-patches@gcc.gnu.org
> Subject: [PATCH] asan, v3: Fix up handling of > 32 byte aligned variables
> with -
On Tue, Apr 9, 2024 at 3:05 PM Hongyu Wang wrote:
>
> The latest APX spec announced removal of SHA/KEYLOCKER evex promotion [1],
> which means the SHA/KEYLOCKER insn does not support EGPR when APX
> enabled. Update the corresponding constraints to their EGPR-disabled
> counterparts.
>
>
On Tue, Apr 9, 2024 at 5:18 PM Jakub Jelinek wrote:
>
> On Tue, Apr 09, 2024 at 11:23:40AM +0800, Hongtao Liu wrote:
> > I think we can merge alternative 2 with 3 to
> > * return TARGET_AES ? \"vaesenc\t{%2, %1, %0|%0, %1, %2}"\" :
> > \&q
On Thu, Apr 4, 2024 at 4:42 PM Jakub Jelinek wrote:
>
> On Wed, Apr 19, 2023 at 02:40:59AM +, Jiang, Haochen via Gcc-patches
> wrote:
> > > > (define_insn "aesenc"
> > > > - [(set (match_operand:V2DI 0 "register_operand" "=x,x")
> > > > - (unspec:V2DI [(match_operand:V2DI 1
On Tue, Apr 9, 2024 at 9:58 AM H.J. Lu wrote:
>
> Define __APX_INLINE_ASM_USE_GPR32__ for -mapx-inline-asm-use-gpr32.
> When __APX_INLINE_ASM_USE_GPR32__ is defined, inline asm statements
> should contain only instructions compatible with r16-r31.
Ok.
>
> gcc/
>
> PR target/114587
>
On Mon, Apr 8, 2024 at 11:44 PM H.J. Lu wrote:
>
> Define following macros for APX options:
>
> 1. __APX_EGPR__: -mapx-features=egpr.
> 2. __APX_PUSH2POP2__: -mapx-features=push2pop2.
> 3. __APX_NDD__: -mapx-features=ndd.
> 4. __APX_PPX__: -mapx-features=ppx.
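A sketch of how user code might test the macros listed above, assuming they end up defined exactly as the patch proposes (the thread does not confirm that here). The function name is hypothetical.

```c
#include <assert.h>

/* Count how many of the four proposed APX macros this build defines.
   On a compiler or option set that defines none of them, returns 0. */
int apx_feature_macro_count(void)
{
    int n = 0;
#ifdef __APX_EGPR__
    n++;    /* -mapx-features=egpr */
#endif
#ifdef __APX_PUSH2POP2__
    n++;    /* -mapx-features=push2pop2 */
#endif
#ifdef __APX_NDD__
    n++;    /* -mapx-features=ndd */
#endif
#ifdef __APX_PPX__
    n++;    /* -mapx-features=ppx */
#endif
    return n;
}
```

Using `#ifdef` rather than testing a value keeps the code valid on compilers that never define these names.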
For -mapx-features=, we haven't
On Tue, Mar 26, 2024 at 11:26 AM Hongtao Liu wrote:
>
> On Mon, Mar 25, 2024 at 8:51 PM Jakub Jelinek wrote:
> >
> > On Tue, Mar 12, 2024 at 07:57:59PM +0800, liuhongt wrote:
> > > if alignb > ASAN_RED_ZONE_SIZE and offset[0] is not multiple of
> > > alig
On Mon, Mar 25, 2024 at 8:51 PM Jakub Jelinek wrote:
>
> On Tue, Mar 12, 2024 at 07:57:59PM +0800, liuhongt wrote:
> > if alignb > ASAN_RED_ZONE_SIZE and offset[0] is not a multiple of
> > alignb, (base_align_bias - base_offset) may not be aligned to alignb, which
> > caused a segmentation fault.
> >
> >
https://gcc.gnu.org/g:e6a3d1f5bcfd954b614155d96c97bde8ac230e2e
commit r13-8488-ge6a3d1f5bcfd954b614155d96c97bde8ac230e2e
Author: liuhongt
Date: Fri Mar 22 10:09:43 2024 +0800
Move pr114396.c from gcc.target/i386 to gcc.c-torture/execute.
Also fixed a typo in the testcase.
https://gcc.gnu.org/g:9a6c7aa1b011b77fcd9b19f7b8d7ff0fc823cdb2
commit r14-9603-g9a6c7aa1b011b77fcd9b19f7b8d7ff0fc823cdb2
Author: liuhongt
Date: Fri Mar 22 10:09:43 2024 +0800
Move pr114396.c from gcc.target/i386 to gcc.c-torture/execute.
Also fixed a typo in the testcase.
https://gcc.gnu.org/g:199b021a38f30b681e0dbecd2d0296beabd50b13
commit r13-8475-g199b021a38f30b681e0dbecd2d0296beabd50b13
Author: liuhongt
Date: Thu Mar 21 13:15:23 2024 +0800
Fix runtime error for nonlinear iv vectorization(step_mult).
wi::from_mpz doesn't take a sign argument,
https://gcc.gnu.org/g:ac2f8c2a367151fc0410f904339c475a953cffc8
commit r14-9591-gac2f8c2a367151fc0410f904339c475a953cffc8
Author: liuhongt
Date: Thu Mar 21 13:15:23 2024 +0800
Fix runtime error for nonlinear iv vectorization(step_mult).
wi::from_mpz doesn't take a sign argument,
nch below clearly
eliminates the dependency.
}
else
{
// The architects say this is safe even for 0.
res = -1;
asm("bsf %1, %0" : "+r"(res) : "rm"(x));
}
return res + 1;
}
--
Best regards,
LIU Hao
https://gcc.gnu.org/g:415091f09096a0ebba1fdcd4af8c2fda24cfd411
commit r14-9588-g415091f09096a0ebba1fdcd4af8c2fda24cfd411
Author: liuhongt
Date: Mon Mar 18 18:53:59 2024 +0800
Document -fexcess-precision=16.
gcc/ChangeLog:
PR middle-end/114347
*
> So - OK with using { target vect_int } instead.
Sure, it's much better to be target independent.
Refactored and committed in r14-9569-g4c276896
Thanks,
- Hao
From: Richard Biener
Sent: Wednesday, March 20, 2024 16:21
To: Hao Liu OS
Cc: GCC-patc
https://gcc.gnu.org/g:4c276896d646c2dbc8047fd81d6e65f8c5ecf01d
commit r14-9569-g4c276896d646c2dbc8047fd81d6e65f8c5ecf01d
Author: Hao Liu
Date: Wed Mar 20 17:37:01 2024 +0800
testsuite: add the case to cover the vectorization of A[(i+x)*stride]
[PR114322]
This issue has been
Hi Richard,
As mentioned in the comments of PR114322 (which has been fixed by PR114151
r14-9540-ge0e9499a), this patch is to cover the case.
Bootstrapped and regression tested on aarch64-linux-gnu, OK for trunk?
gcc/testsuite/ChangeLog:
PR tree-optimization/114322
*
On Tue, Mar 19, 2024 at 12:16 AM Joseph Myers wrote:
>
> On Mon, 18 Mar 2024, liuhongt wrote:
>
> > +If @option{-fexcess-precision=16} is specified, casts and assignments of
> > +@code{_Float16} and @code{bfloat16_t} cause value to be rounded to their
> > +semantic types if they're supported by
On Mon, Mar 18, 2024 at 6:59 PM Uros Bizjak wrote:
>
> On Mon, Mar 18, 2024 at 11:52 AM liuhongt wrote:
> >
> > Commit r14-9459-g618e34d56cc38e only handles
> > general_scalar_chain::convert_op. The patch also handles
> > timode_scalar_chain::convert_op to avoid potential similar bug.
> >
> >
https://gcc.gnu.org/g:942d470a5a4fb1baeff943127a81b441dffaa543
commit r14-9512-g942d470a5a4fb1baeff943127a81b441dffaa543
Author: liuhongt
Date: Fri Mar 15 10:59:10 2024 +0800
Add missing hf/bf patterns.
It will be used by copysignm3/xorsignm3/lroundmn2 expanders.
On Thu, Mar 14, 2024 at 11:42 PM Andrew Stubbs wrote:
>
> Don't enable excess lanes when inverting vector bit-masks smaller than the
> integer mode. This is yet another case of wrong-code due to mishandling
> of oversized bitmasks.
>
> This issue shows up in vect/tsvc/vect-tsvc-s278.c and
>
On Thu, Mar 14, 2024 at 10:46 PM Uros Bizjak wrote:
>
> On Thu, Mar 14, 2024 at 8:42 AM Uros Bizjak wrote:
> >
> > On Thu, Mar 14, 2024 at 8:32 AM Hongtao Liu wrote:
> > >
> > > On Thu, Mar 14, 2024 at 3:22 PM Uros Bizjak wrote:
> > > >
> >
https://gcc.gnu.org/g:a861f940efffae2782c559cd04df2d2740cd28bd
commit r12-10214-ga861f940efffae2782c559cd04df2d2740cd28bd
Author: liuhongt
Date: Wed Mar 13 10:40:01 2024 +0800
i386[stv]: Handle REG_EH_REGION note
When we split
(insn 37 36 38 10 (set (reg:DI 104 [ _18 ])
https://gcc.gnu.org/g:bdbcfbfcf591381f0faf95c881e3772b56d0a404
commit r13-8438-gbdbcfbfcf591381f0faf95c881e3772b56d0a404
Author: liuhongt
Date: Wed Mar 13 10:40:01 2024 +0800
i386[stv]: Handle REG_EH_REGION note
When we split
(insn 37 36 38 10 (set (reg:DI 104 [ _18 ])
https://gcc.gnu.org/g:618e34d56cc38e9c3ae95a413228068e53ed76bb
commit r14-9459-g618e34d56cc38e9c3ae95a413228068e53ed76bb
Author: liuhongt
Date: Wed Mar 13 10:40:01 2024 +0800
i386[stv]: Handle REG_EH_REGION note
When we split
(insn 37 36 38 10 (set (reg:DI 104 [ _18 ])
On Thu, Mar 14, 2024 at 3:22 PM Uros Bizjak wrote:
>
> On Thu, Mar 14, 2024 at 2:33 AM liuhongt wrote:
> >
> > When we split
> > (insn 37 36 38 10 (set (reg:DI 104 [ _18 ])
> > (mem:DI (reg/f:SI 98 [ CallNative_nclosure.0_1 ]) [6 MEM[(struct
> > SQRefCounted
On Tue, Mar 12, 2024 at 8:00 PM liuhongt wrote:
>
> if alignb > ASAN_RED_ZONE_SIZE and offset[0] is not a multiple of
> alignb, (base_align_bias - base_offset) may not be aligned to alignb, which
> caused a segmentation fault.
>
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> Ok for trunk and
May I suggest you keep the mcf thread model for aarch64-w64-mingw32? I requested Martin Storsjö to
test it on a physical Windows 11 on ARM machine with Clang and all tests passed. I think it should
work once the GCC support is complete.
--
Best regards,
LIU Hao
On Thu, Feb 29, 2024 at 2:20 PM Hongtao Liu wrote:
>
> On Wed, Feb 28, 2024 at 4:54 PM Jakub Jelinek wrote:
> >
> > Hi!
> >
> > Adding Hongtao and Honza into the loop as the ones who acked the original
> > patch.
> >
> > The no_callee_saved_regist
On Wed, Feb 28, 2024 at 4:54 PM Jakub Jelinek wrote:
>
> Hi!
>
> Adding Hongtao and Honza into the loop as the ones who acked the original
> patch.
>
> The no_callee_saved_registers by default for noreturn functions change can
> break in-process backtrace(3) or backtraces from debugger or other