[gcc r15-1308] Adjust ix86_rtx_costs for pternlog_operand_p.

2024-06-14 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:d3fae2bea034edb001cd45d1d86c5ceef146899b commit r15-1308-gd3fae2bea034edb001cd45d1d86c5ceef146899b Author: liuhongt Date: Tue Jun 11 21:22:42 2024 +0800 Adjust ix86_rtx_costs for pternlog_operand_p. r15-1100-gec985bc97a0157 improves handling of ternlog

[gcc r15-1307] Remove one_if_conv for latest Intel processors.

2024-06-14 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:8b69efd9819f86b973d7a550e987ce455fce6d62 commit r15-1307-g8b69efd9819f86b973d7a550e987ce455fce6d62 Author: liuhongt Date: Mon Jun 3 10:38:19 2024 +0800 Remove one_if_conv for latest Intel processors. The tune is added by PR79390 for SciMark2 on Broadwell.

[gcc r15-1234] Fix ICE due to REGNO of a SUBREG.

2024-06-12 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:f8bf80a4e1682b2238baad8c44939682f96b1fe0 commit r15-1234-gf8bf80a4e1682b2238baad8c44939682f96b1fe0 Author: liuhongt Date: Thu Jun 13 09:53:58 2024 +0800 Fix ICE due to REGNO of a SUBREG. Use reg_or_subregno instead. gcc/ChangeLog:

[gcc r15-1191] Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P

2024-06-11 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:1d496d2cd1d5d8751a1637abca89339d6f9ddd3b commit r15-1191-g1d496d2cd1d5d8751a1637abca89339d6f9ddd3b Author: liuhongt Date: Tue Jun 11 10:23:27 2024 +0800 Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P The patch adds an extra check to make

[gcc r12-10497] Disable FMADD in chains for Zen4 and generic

2024-06-07 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:5d52558a531130675329d72ca5c4713abf5bf885 commit r12-10497-g5d52558a531130675329d72ca5c4713abf5bf885 Author: Jan Hubicka Date: Fri Dec 29 23:51:03 2023 +0100 Disable FMADD in chains for Zen4 and generic this patch disables use of FMA in matrix multiplication

[gcc r13-8825] Disable FMADD in chains for Zen4 and generic

2024-06-07 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e4f85ea6271a10e13c6874709a05e04ab0508fbf commit r13-8825-ge4f85ea6271a10e13c6874709a05e04ab0508fbf Author: Jan Hubicka Date: Fri Dec 29 23:51:03 2023 +0100 Disable FMADD in chains for Zen4 and generic this patch disables use of FMA in matrix multiplication

[gcc r15-1088] Add additional option --param max-completely-peeled-insns=200 for power64*-*-*

2024-06-06 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b24f2954dbc13d85e9fb62e05a88e9df21e4d4f4 commit r15-1088-gb24f2954dbc13d85e9fb62e05a88e9df21e4d4f4 Author: liuhongt Date: Fri Jun 7 09:29:24 2024 +0800 Add additional option --param max-completely-peeled-insns=200 for power64*-*-* gcc/testsuite/ChangeLog:

[gcc r15-1050] Refine testcase for power10.

2024-06-05 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:fcfce55c85f842ed843cbc4aabe744c6a004dead commit r15-1050-gfcfce55c85f842ed843cbc4aabe744c6a004dead Author: liuhongt Date: Thu Jun 6 11:27:53 2024 +0800 Refine testcase for power10. For power10, there are 3 extra REG_EQUIV notes with (fix:SI. to avoid the

[gcc r15-1048] Adjust rtx_cost for MEM to enable more simplification

2024-06-05 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:961dd0d635217c703a38c48903981e0d60962546 commit r15-1048-g961dd0d635217c703a38c48903981e0d60962546 Author: liuhongt Date: Fri Apr 19 10:39:53 2024 +0800 Adjust rtx_cost for MEM to enable more simplification For CONST_VECTOR_DUPLICATE_P in constant_pool, it is

[gcc r15-1047] Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for vector mode.

2024-06-05 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:7876cde25cbd2f026a0ae488e5263e72f8e9bfa0 commit r15-1047-g7876cde25cbd2f026a0ae488e5263e72f8e9bfa0 Author: liuhongt Date: Fri Apr 19 10:29:34 2024 +0800 Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for vector mode. When mask is (1 << (prec - imm)
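
The identity behind this simplification can be checked with a small scalar model. The sketch below is not GCC code. It assumes 8-bit elements, and because the commit text above is cut off, it also assumes the mask is the low `prec - imm` bits set, i.e. `(1 << (prec - imm)) - 1`. All helper names are hypothetical.

```python
PREC = 8  # element width in bits (assumed for illustration)


def to_signed(x, prec=PREC):
    """Interpret the low `prec` bits of x as a two's-complement value."""
    x &= (1 << prec) - 1
    return x - (1 << prec) if x >> (prec - 1) else x


def ashiftrt(x, imm, prec=PREC):
    """Arithmetic shift right; the result is returned as an unsigned bit pattern."""
    return (to_signed(x, prec) >> imm) & ((1 << prec) - 1)


def lshiftrt(x, imm, prec=PREC):
    """Logical shift right on the unsigned bit pattern."""
    return (x & ((1 << prec) - 1)) >> imm


def and_of_ashiftrt(x, imm, prec=PREC):
    """(AND (ASHIFTRT x imm) mask), with the mask assumed to be the low prec - imm bits."""
    mask = (1 << (prec - imm)) - 1
    return ashiftrt(x, imm, prec) & mask
```

Under these assumptions the mask clears exactly the sign-copy bits that the arithmetic shift fills in, so the result equals the logical shift for every input.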

[gcc r15-1022] Don't simplify NAN/INF or out-of-range constant for FIX/UNSIGNED_FIX.

2024-06-04 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b05288d1f1e4b632eddf8830b4369d4659f6c2ff commit r15-1022-gb05288d1f1e4b632eddf8830b4369d4659f6c2ff Author: liuhongt Date: Tue May 21 16:57:17 2024 +0800 Don't simplify NAN/INF or out-of-range constant for FIX/UNSIGNED_FIX. According to IEEE standard, for

[gcc r15-1003] Adjust testcase for -march=cascadelake

2024-06-03 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:4d207044195b97ecb27c72a7dc987eb8b86644a0 commit r15-1003-g4d207044195b97ecb27c72a7dc987eb8b86644a0 Author: liuhongt Date: Tue Jun 4 10:13:09 2024 +0800 Adjust testcase for -march=cascadelake gcc/testsuite/ChangeLog: PR target/115299

[gcc r15-984] Add some preference for floating point rtl ifcvt when sse4.1 is not available

2024-06-03 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ac306de7d5100d3682eae2270995a9abbe19db38 commit r15-984-gac306de7d5100d3682eae2270995a9abbe19db38 Author: liuhongt Date: Fri May 31 14:38:07 2024 +0800 Add some preference for floating point rtl ifcvt when sse4.1 is not available W/o TARGET_SSE4_1, it takes

[gcc r15-932] Rename double_u with __double_u to avoid polluting the namespace.

2024-05-30 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:3a873c0a7bc8183de95a6103b507101a25eed413 commit r15-932-g3a873c0a7bc8183de95a6103b507101a25eed413 Author: liuhongt Date: Thu May 30 14:15:48 2024 +0800 Rename double_u with __double_u to avoid polluting the namespace. gcc/ChangeLog: *

[gcc r15-920] Support vcond_mask_qiqi and friends.

2024-05-30 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b6c6d5abf0d31c936f50f8f9073c5e335b9e24b7 commit r15-920-gb6c6d5abf0d31c936f50f8f9073c5e335b9e24b7 Author: liuhongt Date: Wed Feb 28 11:17:10 2024 +0800 Support vcond_mask_qiqi and friends. gcc/ChangeLog: * config/i386/sse.md (vcond_mask_):

[gcc r15-919] Don't reduce estimated unrolled size for innermost loop.

2024-05-29 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ef27b91b62c3aa8841c02665dffa8914c742fd37 commit r15-919-gef27b91b62c3aa8841c02665dffa8914c742fd37 Author: liuhongt Date: Tue Feb 27 15:34:57 2024 +0800 Don't reduce estimated unrolled size for innermost loop. For the innermost loop, after completely loop

[gcc r15-882] Reduce cost of MEM (A + imm).

2024-05-28 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:1d6199e5f8c1c08083eeb0279f71333234fe14ad commit r15-882-g1d6199e5f8c1c08083eeb0279f71333234fe14ad Author: liuhongt Date: Mon Feb 19 13:57:24 2024 +0800 Reduce cost of MEM (A + imm). For MEM, rtx_cost iterates each subrtx, and adds up the costs, so for

[gcc r15-857] Fix predicate mismatch between vfcmaddcph's define_insn and define_expand.

2024-05-27 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:c65002347e595cda8b15e59e734d209283faf2b6 commit r15-857-gc65002347e595cda8b15e59e734d209283faf2b6 Author: liuhongt Date: Tue May 28 10:32:12 2024 +0800 Fix predicate mismatch between vfcmaddcph's define_insn and define_expand. When I applied Roger's patch

[gcc r15-814] Fix typo in the testcase.

2024-05-24 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:51f4b47c4f4f61fe31a7bd1fa80e08c2438d76a8 commit r15-814-g51f4b47c4f4f61fe31a7bd1fa80e08c2438d76a8 Author: liuhongt Date: Fri May 24 09:49:08 2024 +0800 Fix typo in the testcase. gcc/testsuite/ChangeLog: PR target/114148 *

[gcc r15-717] Use pblendw instead of pand to clear upper 16 bits.

2024-05-20 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:0ebaffccb294d90184ad78367de66b6307de3ac0 commit r15-717-g0ebaffccb294d90184ad78367de66b6307de3ac0 Author: liuhongt Date: Fri Mar 22 14:40:00 2024 +0800 Use pblendw instead of pand to clear upper 16 bits. For vec_pack_truncv8si/v4si w/o AVX512,
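
A rough model of why a word blend can stand in for the AND when clearing the upper 16 bits of each 32-bit element. This is an illustrative sketch, not the GCC expander. It assumes a 128-bit vector of four 32-bit elements in little-endian word order, and the helper names are hypothetical. The `pblendw` model takes word `i` from the second source when bit `i` of the immediate is set.

```python
def to_words(vec):
    """Split 32-bit elements into 16-bit words, low word first (little-endian)."""
    words = []
    for e in vec:
        words += [e & 0xFFFF, (e >> 16) & 0xFFFF]
    return words


def from_words(words):
    """Recombine pairs of 16-bit words into 32-bit elements."""
    return [words[i] | (words[i + 1] << 16) for i in range(0, len(words), 2)]


def pblendw(dst, src, imm8):
    """Model of a word blend: take word i from src if bit i of imm8 is set."""
    return [s if (imm8 >> i) & 1 else d for i, (d, s) in enumerate(zip(dst, src))]


def clear_upper_with_pand(vec):
    """Clear the upper 16 bits of each element with an AND mask."""
    return [e & 0x0000FFFF for e in vec]


def clear_upper_with_pblendw(vec):
    """Clear the upper 16 bits by blending the odd words from a zero vector."""
    zero = [0] * 8
    return from_words(pblendw(to_words(vec), zero, 0xAA))
```

With immediate `0xAA` the blend replaces words 1, 3, 5 and 7 (the upper halves) with zero, which matches the AND with `0x0000FFFF` in this model.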

[gcc r15-530] Set d.one_operand_p to true when TARGET_SSSE3 in ix86_expand_vecop_qihi_partial.

2024-05-15 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:090714e6cf8029f4ff8883dce687200024adbaeb commit r15-530-g090714e6cf8029f4ff8883dce687200024adbaeb Author: liuhongt Date: Wed May 15 10:56:24 2024 +0800 Set d.one_operand_p to true when TARGET_SSSE3 in ix86_expand_vecop_qihi_partial. pshufb is available

[gcc r15-529] Optimize ashift >> 7 to vpcmpgtb for vector int8.

2024-05-15 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:0cc0956b3bb8bcbc9196075b9073a227d799e042 commit r15-529-g0cc0956b3bb8bcbc9196075b9073a227d799e042 Author: liuhongt Date: Tue May 14 18:39:54 2024 +0800 Optimize ashift >> 7 to vpcmpgtb for vector int8. Since there is no corresponding instruction, the shift
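
The equivalence this optimization relies on can be illustrated per byte. The sketch below is not GCC code. It assumes the shift in question is an arithmetic right shift by 7 on signed 8-bit elements, and it models the compare as a signed "0 > x" that yields all-ones when true. The function names are hypothetical.

```python
def to_signed8(x):
    """Interpret the low 8 bits of x as a two's-complement value."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def ashiftrt7(x):
    """Arithmetic shift right by 7 of a signed byte, returned as a bit pattern."""
    return (to_signed8(x) >> 7) & 0xFF


def cmpgt_zero(x):
    """Model of a signed byte compare 0 > x: 0xFF where true, 0x00 where false."""
    return 0xFF if 0 > to_signed8(x) else 0x00
```

Shifting a signed byte right by 7 leaves only copies of the sign bit, so both functions return `0xFF` for negative inputs and `0x00` otherwise.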

[gcc r15-499] x86: Add 3-instruction subroutine vector shift for V16QI in ix86_expand_vec_perm_const_1 [PR107563]

2024-05-14 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a71f90c5a7ae2942083921033cb23dcd63e70525 commit r15-499-ga71f90c5a7ae2942083921033cb23dcd63e70525 Author: Levy Hsu Date: Thu May 9 16:50:56 2024 +0800 x86: Add 3-instruction subroutine vector shift for V16QI in ix86_expand_vec_perm_const_1 [PR107563] Hi All

[gcc r15-234] Optimize 64-bit vector permutation with punpcklqdq + 128-bit vector pshuf.

2024-05-07 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a9f642783853b60bb0a59562b8ab3ed10ec01641 commit r15-234-ga9f642783853b60bb0a59562b8ab3ed10ec01641 Author: liuhongt Date: Wed Dec 20 11:54:43 2023 +0800 Optimize 64-bit vector permutation with punpcklqdq + 128-bit vector pshuf. gcc/ChangeLog:

[gcc r15-236] Extend usdot_prodv*qi with vpmaddwd when AVXVNNI/AVX512VNNI is not available.

2024-05-07 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:8b974f54393ab2d2d16a0051a68c155455a92aad commit r15-236-g8b974f54393ab2d2d16a0051a68c155455a92aad Author: liuhongt Date: Mon Jan 8 15:13:41 2024 +0800 Extend usdot_prodv*qi with vpmaddwd when AVXVNNI/AVX512VNNI is not available. gcc/ChangeLog:

[gcc r15-235] Support dot_prod optabs for 64-bit vector.

2024-05-07 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:fa911365490a7ca308878517a4af6189ffba7ed6 commit r15-235-gfa911365490a7ca308878517a4af6189ffba7ed6 Author: liuhongt Date: Wed Dec 20 11:43:25 2023 +0800 Support dot_prod optabs for 64-bit vector. gcc/ChangeLog: PR target/113079 *

[gcc r15-167] Update libbid according to the latest Intel Decimal Floating-Point Math Library.

2024-05-05 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:affd77d3fe7bfb525b3fb23316d164e847ed02d1 commit r15-167-gaffd77d3fe7bfb525b3fb23316d164e847ed02d1 Author: liuhongt Date: Wed Mar 27 08:20:13 2024 +0800 Update libbid according to the latest Intel Decimal Floating-Point Math Library. The Intel Decimal

[gcc r15-22] Adjust alternative *k to ?k for avx512 mask in zero_extend patterns

2024-04-28 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:c19a674d03847b900919b97d0957c8ae5164f8f1 commit r15-22-gc19a674d03847b900919b97d0957c8ae5164f8f1 Author: liuhongt Date: Tue Apr 16 08:37:22 2024 +0800 Adjust alternative *k to ?k for avx512 mask in zero_extend patterns So when both source operand and dest

[gcc r13-8488] Move pr114396.c from gcc.target/i386 to gcc.c-torture/execute.

2024-03-21 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e6a3d1f5bcfd954b614155d96c97bde8ac230e2e commit r13-8488-ge6a3d1f5bcfd954b614155d96c97bde8ac230e2e Author: liuhongt Date: Fri Mar 22 10:09:43 2024 +0800 Move pr114396.c from gcc.target/i386 to gcc.c-torture/execute. Also fixed a typo in the testcase.

[gcc r14-9603] Move pr114396.c from gcc.target/i386 to gcc.c-torture/execute.

2024-03-21 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:9a6c7aa1b011b77fcd9b19f7b8d7ff0fc823cdb2 commit r14-9603-g9a6c7aa1b011b77fcd9b19f7b8d7ff0fc823cdb2 Author: liuhongt Date: Fri Mar 22 10:09:43 2024 +0800 Move pr114396.c from gcc.target/i386 to gcc.c-torture/execute. Also fixed a typo in the testcase.

[gcc r13-8475] Fix runtime error for nonlinear iv vectorization(step_mult).

2024-03-21 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:199b021a38f30b681e0dbecd2d0296beabd50b13 commit r13-8475-g199b021a38f30b681e0dbecd2d0296beabd50b13 Author: liuhongt Date: Thu Mar 21 13:15:23 2024 +0800 Fix runtime error for nonlinear iv vectorization(step_mult). wi::from_mpz doesn't take a sign argument,

[gcc r14-9591] Fix runtime error for nonlinear iv vectorization(step_mult).

2024-03-21 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ac2f8c2a367151fc0410f904339c475a953cffc8 commit r14-9591-gac2f8c2a367151fc0410f904339c475a953cffc8 Author: liuhongt Date: Thu Mar 21 13:15:23 2024 +0800 Fix runtime error for nonlinear iv vectorization(step_mult). wi::from_mpz doesn't take a sign argument,

[gcc r14-9588] Document -fexcess-precision=16.

2024-03-20 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:415091f09096a0ebba1fdcd4af8c2fda24cfd411 commit r14-9588-g415091f09096a0ebba1fdcd4af8c2fda24cfd411 Author: liuhongt Date: Mon Mar 18 18:53:59 2024 +0800 Document -fexcess-precision=16. gcc/ChangeLog: PR middle-end/114347 *

[gcc r14-9512] Add missing hf/bf patterns.

2024-03-17 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:942d470a5a4fb1baeff943127a81b441dffaa543 commit r14-9512-g942d470a5a4fb1baeff943127a81b441dffaa543 Author: liuhongt Date: Fri Mar 15 10:59:10 2024 +0800 Add missing hf/bf patterns. It will be used by copysignm3/xorsignm3/lroundmn2 expanders.

[gcc r12-10214] i386[stv]: Handle REG_EH_REGION note

2024-03-14 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a861f940efffae2782c559cd04df2d2740cd28bd commit r12-10214-ga861f940efffae2782c559cd04df2d2740cd28bd Author: liuhongt Date: Wed Mar 13 10:40:01 2024 +0800 i386[stv]: Handle REG_EH_REGION note When we split (insn 37 36 38 10 (set (reg:DI 104 [ _18 ])

[gcc r13-8438] i386[stv]: Handle REG_EH_REGION note

2024-03-14 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:bdbcfbfcf591381f0faf95c881e3772b56d0a404 commit r13-8438-gbdbcfbfcf591381f0faf95c881e3772b56d0a404 Author: liuhongt Date: Wed Mar 13 10:40:01 2024 +0800 i386[stv]: Handle REG_EH_REGION note When we split (insn 37 36 38 10 (set (reg:DI 104 [ _18 ])

[gcc r14-9459] i386[stv]: Handle REG_EH_REGION note

2024-03-14 hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:618e34d56cc38e9c3ae95a413228068e53ed76bb commit r14-9459-g618e34d56cc38e9c3ae95a413228068e53ed76bb Author: liuhongt Date: Wed Mar 13 10:40:01 2024 +0800 i386[stv]: Handle REG_EH_REGION note When we split (insn 37 36 38 10 (set (reg:DI 104 [ _18 ])