On Thu, May 23, 2024 at 7:53 PM Evgeny Karpov
wrote:
>
>
> Thursday, May 23, 2024 10:35 AM
> Uros Bizjak wrote:
>
> > Richard Sandiford wrote:
> > >
> > > > This looks good to me apart from a couple of very minor comments
> > > > below, bu
On Thu, May 23, 2024 at 10:35 AM Uros Bizjak wrote:
>
> On Wed, May 22, 2024 at 4:32 PM Evgeny Karpov
> wrote:
> >
> > Wednesday, May 22, 2024 1:06 PM
> > Richard Sandiford wrote:
> >
> > > This looks good to me apart from a couple of very minor c
On Wed, May 22, 2024 at 4:32 PM Evgeny Karpov
wrote:
>
> Wednesday, May 22, 2024 1:06 PM
> Richard Sandiford wrote:
>
> > This looks good to me apart from a couple of very minor comments below, but
> > please get approval from the x86 maintainers as well. In particular, they
> > might
> >
On Wed, May 22, 2024 at 5:15 PM Roger Sayle wrote:
>
> This single line patch fixes a strange quirk/glitch in i386's rtx_costs,
> which considers an instruction loading a 64-bit constant to be significantly
> cheaper than loading a 32-bit (or smaller) constant.
>
> Consider the two functions:
>
On Wed, May 22, 2024 at 10:29 AM Kong, Lingling wrote:
>
> > I wonder if we can use "define_subst" to conditionally add flags clobber
> > for !TARGET_APX_NF targets. Even the example for "Define Subst" uses the
> > insn
> > w/ and w/o the clobber, so I think it is worth considering this
On Tue, May 21, 2024 at 11:01 AM Haochen Jiang wrote:
>
> Hi all,
>
> This is the v3 patch to fix PR115069. The new testcase has passed.
>
> Changes in v3:
> - Simplify the testcase.
>
> Changes in v2:
> - Add a testcase.
> - Change the comment for the early exit.
>
> Thx,
> Haochen
>
>
On Tue, May 21, 2024 at 7:13 AM liuhongt wrote:
>
> For CONST_VECTOR_DUPLICATE_P in constant_pool, it is just broadcast or
> variants in ix86_vector_duplicate_simode_const.
> Adjust the cost to COSTS_N_INSNS (2) + speed which should be a little
> bit larger than broadcast.
>
> Bootstrapped and
On Tue, May 21, 2024 at 8:16 AM Haochen Jiang wrote:
>
> Hi all,
>
> Since vpermq is really slow, we should avoid using it when it is
> the only instruction that could be used for ix86_expand_vecop_qihi2.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu. Ok for trunk?
>
> Thx,
> Haochen
>
>
https://gcc.gnu.org/g:b59de4113262f2bee14147eb17eb3592f03d9556
commit r15-634-gb59de4113262f2bee14147eb17eb3592f03d9556
Author: Uros Bizjak
Date: Fri May 17 09:55:49 2024 +0200
i386: Rename sat_plusminus expanders to standard names [PR112600]
Rename _3 expander to a standard ssadd,
usadd, sssub and ussub name to enable corresponding optab expansion.
Also add named expander for MMX modes.
PR middle-end/112600
gcc/ChangeLog:
* config/i386/mmx.md (3): New expander.
* config/i386/sse.md
(_3):
Rename expander to 3.
On Wed, May 15, 2024 at 12:05 PM liuhongt wrote:
>
> pshufb is available under TARGET_SSSE3, so
> ix86_expand_vec_perm_const_1 must return true when TARGET_SSSE3.
> w/o TARGET_SSSE3, if we set one_operand_p to true,
> ix86_expand_vec_perm_const_1 could return false.
>
> With the patch under
On Wed, May 15, 2024 at 9:43 AM Kong, Lingling wrote:
>
> From: Hongyu Wang
>
> The APX NF (no flags) feature suppresses the update of status flags for
> arithmetic operations.
>
> For NF add, it is not clear whether NF add can be faster than lea. If so, the
> pattern needs to be
On Wed, May 15, 2024 at 9:43 AM Kong, Lingling wrote:
>
> From: Hongyu Wang
>
> The APX NF (no flags) feature suppresses the update of status flags for
> arithmetic operations.
>
> For NF add, it is not clear whether NF add can be faster than lea. If so, the
> pattern needs to be
On Thu, May 9, 2024 at 11:12 AM Levy Hsu wrote:
>
> Hi All
>
> We've introduced a new subroutine in ix86_expand_vec_perm_const_1
> to optimize vector shifting for the V16QI type on x86.
> This patch uses a three-instruction sequence psrlw, psllw, and por
> to handle specific vector shuffle
On Wed, May 8, 2024 at 4:44 AM Levy Hsu wrote:
>
> PR target/107563
>
> gcc/ChangeLog:
>
> * config/i386/i386-expand.cc (expand_vec_perm_psrlw_psllw_por): New
> subroutine.
> (ix86_expand_vec_perm_const_1): New Entry.
>
> gcc/testsuite/ChangeLog:
>
> *
On Mon, May 6, 2024 at 5:20 AM Hongtao Liu wrote:
>
> CC uros.
>
> On Mon, May 6, 2024 at 11:03 AM Kong, Lingling
> wrote:
> >
> > Hi,
> > (if_then_else:SI (eq (reg:CCZ 17 flags)
> > (const_int 0 [0]))
> > (reg/v:SI 101 [ e ])
> > (reg:SI 102))
> > The cost is 8 for the rtx, the
On Sun, Apr 28, 2024 at 7:47 AM liuhongt wrote:
>
> So when both source operand and dest operand require avx512 MASK_REGS, RA
> can allocate a MASK_REGS register instead of a GPR to avoid reloading it
> from GPR to MASK_REGS.
> It's similar to what was done for logic patterns.
>
> Bootstrapped and regtested
On Fri, Apr 26, 2024 at 11:03 AM Haochen Jiang wrote:
>
> Hi all,
>
> The array index should not be over 8 for v8hi, or it will fail
> under -O0 or using -fstack-protector.
>
> This patch aims to fix that, which is mentioned in PR110621.
>
> Commit as obvious and backport to GCC13.
>
> Thx,
>
https://gcc.gnu.org/g:624c3bb9ff762f196852dc77233610d1cdf7d7be
commit r11-11351-g624c3bb9ff762f196852dc77233610d1cdf7d7be
Author: Jakub Jelinek
Date: Fri Mar 22 09:23:44 2024 +0100
ubsan: Don't -fsanitize=null instrument __seg_fs/gs pointers [PR111736]
On x86 and avr some address
https://gcc.gnu.org/g:09910b6753427eeb3f6dded4fae3578851da7422
commit r11-11352-g09910b6753427eeb3f6dded4fae3578851da7422
Author: Jakub Jelinek
Date: Tue Mar 26 11:06:15 2024 +0100
tsan: Don't instrument non-generic AS accesses [PR111736]
Similar to the asan and ubsan changes, we
https://gcc.gnu.org/g:b4e1aee01a2fa617cf74ab04cf0ab574761aaaea
commit r11-11350-gb4e1aee01a2fa617cf74ab04cf0ab574761aaaea
Author: Richard Biener
Date: Thu Mar 21 08:30:39 2024 +0100
tree-optimization/111736 - avoid address sanitizing of __seg_gs
The following more thoroughly
https://gcc.gnu.org/g:b86b523fb53f5ffb0e3f3236fc526a587944d9ea
commit r11-11349-gb86b523fb53f5ffb0e3f3236fc526a587944d9ea
Author: Richard Biener
Date: Tue Dec 5 14:00:43 2023 +0100
sanitizer/111736 - skip ASAN for globals in alternate address-space
gcc/ChangeLog:
On Tue, Apr 23, 2024 at 5:50 PM Jakub Jelinek wrote:
>
> Hi!
>
> As discussed in the PR, on ia32 with its 8 GPRs, where 1 is always fixed
> and 2 others often are as well, having an alternative which needs 3
> double-word registers is just too much for RA.
> The following patch splits that
https://gcc.gnu.org/g:48fd1c5791b47717dcd4fa5615bc07cf54e964a7
commit r12-10390-g48fd1c5791b47717dcd4fa5615bc07cf54e964a7
Author: Jakub Jelinek
Date: Tue Mar 26 11:06:15 2024 +0100
tsan: Don't instrument non-generic AS accesses [PR111736]
Similar to the asan and ubsan changes, we
https://gcc.gnu.org/g:e89b5ed62a5a06fb8918ffa1616f0f37c8d359c3
commit r12-10388-ge89b5ed62a5a06fb8918ffa1616f0f37c8d359c3
Author: Richard Biener
Date: Thu Mar 21 08:30:39 2024 +0100
tree-optimization/111736 - avoid address sanitizing of __seg_gs
The following more thoroughly
https://gcc.gnu.org/g:d6c62e4fb9a6d395599b7c78c831bace4bc7ff8f
commit r12-10389-gd6c62e4fb9a6d395599b7c78c831bace4bc7ff8f
Author: Jakub Jelinek
Date: Fri Mar 22 09:23:44 2024 +0100
ubsan: Don't -fsanitize=null instrument __seg_fs/gs pointers [PR111736]
On x86 and avr some address
https://gcc.gnu.org/g:61d1962e7c3c32da6962d9cb20f6fd996501f3f2
commit r12-10387-g61d1962e7c3c32da6962d9cb20f6fd996501f3f2
Author: Richard Biener
Date: Tue Dec 5 14:00:43 2023 +0100
sanitizer/111736 - skip ASAN for globals in alternate address-space
PR sanitizer/111736
On Tue, Apr 16, 2024 at 5:52 AM Alexandre Oliva wrote:
>
>
> Without -msse2, an i586-targeting toolchain fails bf16_short_warn.c
> because neither type __m128bh nor intrinsic _mm_cvtneps_pbh get
> declared.
>
> Regstrapped on x86_64-linux-gnu. Also tested with gcc-13 on arm-,
> aarch64-, x86-
On Tue, Apr 16, 2024 at 5:51 AM Alexandre Oliva wrote:
>
>
> A few x86 tests get unexpected insn counts if the toolchain is
> configured with --enable-frame-pointer. Add explicit
> -fomit-frame-pointer so that the expected insn sequences are output.
>
> Regstrapped on x86_64-linux-gnu. Also
On Thu, Apr 11, 2024 at 4:02 PM Segher Boessenkool
wrote:
>
> On Wed, Apr 10, 2024 at 08:32:39PM +0200, Uros Bizjak wrote:
> > On Wed, Apr 10, 2024 at 7:56 PM Segher Boessenkool
> > wrote:
> > > This is never okay. You cannot commit a patch without approval, *eve
On Wed, Apr 10, 2024 at 7:56 PM Segher Boessenkool
wrote:
>
> On Sun, Apr 07, 2024 at 08:31:38AM +0200, Uros Bizjak wrote:
> > If there are no further comments, I plan to commit the referred patch
> > to the mainline on Wednesday. The latest version can be considered an
https://gcc.gnu.org/g:eaccdba315b86d374a4e72b9dd8fefb0fc3cc5ee
commit r14-9847-geaccdba315b86d374a4e72b9dd8fefb0fc3cc5ee
Author: Uros Bizjak
Date: Mon Apr 8 20:54:30 2024 +0200
combine: Fix ICE in try_combine on pr112494.c [PR112560]
The compiler, configured with --enable
On Mon, Apr 1, 2024 at 9:28 PM Uros Bizjak wrote:
> I'd like to ping the
> https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647634.html
> PR112560 P1 patch.
If there are no further comments, I plan to commit the referred patch
to the mainline on Wednesday. The latest ve
On Fri, Apr 5, 2024 at 5:56 PM H.J. Lu wrote:
>
> Don't use implicit shift count in double-precision shifts in AT&T syntax
> since they aren't in Intel SDM. Keep the 's' modifier for backward
> compatibility with inline asm statements.
>
> PR target/114590
> * config/i386/i386.md
On Thu, Apr 4, 2024 at 5:08 PM H.J. Lu wrote:
>
> Define __APX_F__ when APX is enabled.
>
> gcc/
>
> PR target/114587
> * config/i386/i386-c.cc (ix86_target_macros_internal): Define
> __APX_F__ when APX is enabled.
>
> gcc/testsuite/
>
> PR target/114587
>
Hello!
I'd like to ping the
https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647634.html
PR112560 P1 patch.
Thanks,
Uros.
On Wed, Mar 27, 2024 at 11:48 AM Jakub Jelinek wrote:
>
> Hi!
>
> These tests have been FAILing on i686-linux since July last year,
> likely since r14-2628. Since that patch gcc claims _Float16 and __bf16
> support even without -msse2 because some functions could be using
> target attribute.
>
On Thu, Mar 21, 2024 at 10:26 AM Rainer Orth
wrote:
>
> Two avx512cd tests FAIL to assemble with the Solaris/x86 assembler:
>
> FAIL: gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c (test for excess errors)
> UNRESOLVED: gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c compilation failed
> to produce
https://gcc.gnu.org/g:f6ed0466d40de496b14225fae44acf618dac1fd2
commit r12-10284-gf6ed0466d40de496b14225fae44acf618dac1fd2
Author: Uros Bizjak
Date: Tue Mar 19 16:57:50 2024 +0100
testsuite/i386: Correct pr111822.C dg-do options [PR111822]
PR target/111822
gcc
https://gcc.gnu.org/g:1a6d04fce7d78b9e5201333be0c0877390f81bc3
commit r13-8466-g1a6d04fce7d78b9e5201333be0c0877390f81bc3
Author: Uros Bizjak
Date: Tue Mar 19 16:56:11 2024 +0100
i386: Unify {general,timode}_scalar_chain::convert_op [PR111822]
Recent PR111822 fix implemented REG_EH_REGION note copying to a STV converted
preload instruction in general_scalar_chain::convert_op. However, the same
issue remains in timode_scalar_chain::convert_op. Instead of copying the
newly introduced code to timode_scalar_chain::convert_op, the patch
https://gcc.gnu.org/g:b96c5436880d7926299314a33c953171082ab59e
commit r14-9523-gb96c5436880d7926299314a33c953171082ab59e
Author: Uros Bizjak
Date: Mon Mar 18 20:40:29 2024 +0100
i386: Unify {general,timode}_scalar_chain::convert_op [PR111822]
Recent PR111822 fix implemented
On Mon, Mar 18, 2024 at 3:51 PM Segher Boessenkool
wrote:
>
> On Thu, Mar 07, 2024 at 11:46:54PM +0100, Uros Bizjak wrote:
> > > Can't you just describe the dataflow then, without an unspec? An unspec
> > > by definition does some (unspecified) operation on the
On Mon, Mar 18, 2024 at 3:46 PM Segher Boessenkool
wrote:
>
> On Thu, Mar 07, 2024 at 11:27:28PM +0100, Uros Bizjak wrote:
> > On Thu, Mar 7, 2024 at 11:07 PM Uros Bizjak wrote:
> > > > > (unspec:DI [
> > > > > (reg:C
On Mon, Mar 18, 2024 at 11:52 AM liuhongt wrote:
>
> Commit r14-9459-g618e34d56cc38e only handles
> general_scalar_chain::convert_op. The patch also handles
> timode_scalar_chain::convert_op to avoid potential similar bug.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for
On Fri, Mar 15, 2024 at 9:50 AM Jakub Jelinek wrote:
>
> Hi!
>
> In r13-3803-gfa271afb58 I've added an optimization for LE/LEU/GE/GEU
> comparison against CONST_VECTOR. As the comments say:
> /* x <= cst can be handled as x < cst + 1 unless there is
> wrap around in cst + 1.
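The wrap-around caveat in the quoted comment can be shown with a small scalar sketch. It is only an illustration on 8-bit unsigned values, not the vector code from the patch, and the helper names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative helpers only; the names are not taken from GCC. */

/* Rewrites "x <= cst" as "x < cst + 1" in 8-bit unsigned arithmetic,
   deliberately without guarding against wrap-around in cst + 1. */
bool lt_after_increment_u8(uint8_t x, uint8_t cst)
{
    uint8_t bumped = (uint8_t)(cst + 1); /* wraps to 0 when cst == 255 */
    return x < bumped;
}

/* For every cst below the maximum value the rewrite agrees with "x <= cst". */
void check_le_to_lt_u8(void)
{
    for (unsigned cst = 0; cst < 255; cst++)
        for (unsigned x = 0; x <= 255; x++)
            assert((x <= cst) == lt_after_increment_u8((uint8_t)x, (uint8_t)cst));
}
```

With cst == 255 the incremented constant wraps to 0, so the rewritten comparison is false for every x even though x <= 255 always holds. That is the case the rewrite has to exclude.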
On Thu, Mar 14, 2024 at 8:42 AM Uros Bizjak wrote:
>
> On Thu, Mar 14, 2024 at 8:32 AM Hongtao Liu wrote:
> >
> > On Thu, Mar 14, 2024 at 3:22 PM Uros Bizjak wrote:
> > >
> > > On Thu, Mar 14, 2024 at 2:33 AM liuhongt wrote:
> > > >
> > >
On Thu, Mar 14, 2024 at 8:32 AM Hongtao Liu wrote:
>
> On Thu, Mar 14, 2024 at 3:22 PM Uros Bizjak wrote:
> >
> > On Thu, Mar 14, 2024 at 2:33 AM liuhongt wrote:
> > >
> > > When we split
> > > (insn 37 36 38 10 (set (reg:DI 104 [ _18 ])
> > >
On Thu, Mar 14, 2024 at 2:33 AM liuhongt wrote:
>
> When we split
> (insn 37 36 38 10 (set (reg:DI 104 [ _18 ])
> (mem:DI (reg/f:SI 98 [ CallNative_nclosure.0_1 ]) [6 MEM[(struct
> SQRefCounted *)CallNative_nclosure.0_1]._uiRef+0 S8 A32])) "test.C":22:42 84
> {*movdi_internal}
>
Forgot to CC gcc-patches@ ML... sorry for the duplicate...
The compiler, configured with --enable-checking=yes,rtl,extra ICEs with:
internal compiler error: RTL check: expected elt 0 type 'e' or 'u',
have 'E' (rtx unspec) in try_combine, at combine.cc:3237
This is
3236 /* Just
On Thu, Mar 7, 2024 at 11:29 PM Segher Boessenkool
wrote:
>
> On Thu, Mar 07, 2024 at 11:07:18PM +0100, Uros Bizjak wrote:
> > On Thu, Mar 7, 2024 at 10:37 PM Segher Boessenkool
> > wrote:
> > > > but can be something else, such as the above not
On Thu, Mar 7, 2024 at 11:07 PM Uros Bizjak wrote:
>
> On Thu, Mar 7, 2024 at 10:37 PM Segher Boessenkool
> wrote:
> >
> > On Thu, Mar 07, 2024 at 10:04:32PM +0100, Uros Bizjak wrote:
> >
> > [snip]
> >
> > > The part we want to fix deals with the
On Thu, Mar 7, 2024 at 10:37 PM Segher Boessenkool
wrote:
>
> On Thu, Mar 07, 2024 at 10:04:32PM +0100, Uros Bizjak wrote:
>
> [snip]
>
> > The part we want to fix deals with the *user* of the CC register. It
> > is not true that this is always COMPARISON_P, so EQ, NE,
On Thu, Mar 7, 2024 at 10:04 PM Uros Bizjak wrote:
> The source code that deals with the *user* of the CC register assumes
> the former form, so it blindly tries to update the mode of the CC
> register inside LT comparison RTX (some other nearby source code even
> checks for (cons
On Thu, Mar 7, 2024 at 6:39 PM Segher Boessenkool
wrote:
>
> On Thu, Mar 07, 2024 at 10:55:12AM +0100, Richard Biener wrote:
> > On Thu, 7 Mar 2024, Uros Bizjak wrote:
> > > This is
> > >
> > > 3236 /* Just replace the CC reg with a new mode.
On Thu, Mar 7, 2024 at 12:11 PM Richard Biener wrote:
>
> On Thu, 7 Mar 2024, Jakub Jelinek wrote:
>
> > On Thu, Mar 07, 2024 at 11:11:35AM +0100, Uros Bizjak wrote:
> > > > Since you CCed me - looking at the code I wonder why we fatally fail.
> > > >
On Thu, Mar 7, 2024 at 11:37 AM Jakub Jelinek wrote:
>
> On Thu, Mar 07, 2024 at 11:11:35AM +0100, Uros Bizjak wrote:
> > > Since you CCed me - looking at the code I wonder why we fatally fail.
> > > The following might also fix the issue and preserve more of th
On Thu, Mar 7, 2024 at 10:56 AM Richard Biener wrote:
>
> On Thu, 7 Mar 2024, Uros Bizjak wrote:
>
> > The compiler, configured with --enable-checking=yes,rtl,extra ICEs with:
> >
> > internal compiler error: RTL check: expected elt 0 type 'e' or 'u',
> > hav
The compiler, configured with --enable-checking=yes,rtl,extra ICEs with:
internal compiler error: RTL check: expected elt 0 type 'e' or 'u',
have 'E' (rtx unspec) in try_combine, at combine.cc:3237
This is
3236 /* Just replace the CC reg with a new mode. */
3237 SUBST
optimize_function_for_size_p predicate is not stable during optab selection,
because it also depends on node->count/node->frequency of the current function,
which are updated during IPA, so they may change between early opts and
late opts. Use optimize_size instead - optimize_size implies
https://gcc.gnu.org/g:74e8cc28eda9b1d75588fcd4017a735911b9d2b4
commit r14-9346-g74e8cc28eda9b1d75588fcd4017a735911b9d2b4
Author: Uros Bizjak
Date: Wed Mar 6 20:53:50 2024 +0100
i386: Fix and improve insn constraint for V2QI arithmetic/shift insns
optimize_function_for_size_p
Eliminate common code from x86_32 TARGET_MACHO part in ix86_expand_move and
use generic code instead.
No functional changes.
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_move) [TARGET_MACHO]:
Eliminate common code and use generic code instead.
Bootstrapped and regression
https://gcc.gnu.org/g:e772c0c05c36d0b0539effb4256be67bbedd77fb
commit r14-9338-ge772c0c05c36d0b0539effb4256be67bbedd77fb
Author: Uros Bizjak
Date: Wed Mar 6 17:08:25 2024 +0100
i386: Eliminate common code from x86_32 TARGET_MACHO part in
ix86_expand_move
Eliminate common code
On Wed, Mar 6, 2024 at 9:10 AM Jakub Jelinek wrote:
>
> Hi!
>
> When writing the rest_of_handle_insert_vzeroupper workaround to manually
> remove all the REG_DEAD/REG_UNUSED notes from the IL, I've missed that
> there is a df_analyze () call right after it and that the problems added
> earlier in
On Mon, Mar 4, 2024 at 9:41 AM Jakub Jelinek wrote:
>
> On Mon, Mar 04, 2024 at 09:34:30AM +0100, Uros Bizjak wrote:
> > > --- gcc/config/i386/i386-expand.cc.jj 2024-03-01 14:56:34.120925989
> > > +0100
> > > +++ gcc/config/i386/i386-expand.cc 2024-03-0
On Mon, Mar 4, 2024 at 9:25 AM Jakub Jelinek wrote:
>
> Hi!
>
> The Intel extended format has the various weird number categories,
> pseudo denormals, pseudo infinities, pseudo NaNs and unnormals.
> Those are not representable in the GCC real_value and so neither
> GIMPLE nor RTX
umuldi3_highpart expander does:
if (REG_P (operands[2]))
operands[2] = gen_rtx_ZERO_EXTEND (TImode, operands[2]);
on register_operand predicate, which also allows SUBREG RTX. So,
subregs were emitted without ZERO_EXTEND RTX.
But nowadays we have UMUL_HIGHPART that allows us to fix this
Also handle V2BF mode.
PR target/113871
gcc/ChangeLog:
* config/i386/mmx.md (V248FI): Add V2BF mode.
(V24FI_32): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr113871-5a.c: New test.
* gcc.target/i386/pr113871-5b.c: New test.
Bootstrapped and regression tested on
On Mon, Feb 26, 2024 at 10:33 AM Jakub Jelinek wrote:
>
> Hi!
>
> I'd like to ping 2 patches:
> https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645326.html
> i386: Enable _BitInt support on ia32
>
> all the FAILs mentioned in that mail have been fixed by now.
LGTM, based on HJ's advice.
On Sun, Feb 25, 2024 at 10:14 PM H.J. Lu wrote:
>
> ix86_set_func_type checks noreturn attribute to avoid incompatible
> attribute error in LTO1 on interrupt functions. Since TREE_THIS_VOLATILE
> is set also for _Noreturn without noreturn attribute, check interrupt
> attribute for interrupt
On Sun, Feb 25, 2024 at 5:01 PM H.J. Lu wrote:
>
> ix86_set_func_type checks noreturn attribute to avoid incompatible
> attribute error in LTO1 on interrupt functions. Since TREE_THIS_VOLATILE
> is set also for _Noreturn without noreturn attribute, check interrupt
> attribute for interrupt
On Fri, Feb 23, 2024 at 3:45 AM H.J. Lu wrote:
>
> On Thu, Feb 22, 2024 at 6:39 PM Hongtao Liu wrote:
> >
> > On Thu, Feb 22, 2024 at 10:33 PM H.J. Lu wrote:
> > >
> > > On Sun, Feb 18, 2024 at 8:02 AM H.J. Lu wrote:
> > > >
> > > > If assembler and linker supports
> > > >
> > > > add %reg1,
A compile-time test can use -march=skylake-avx512 for all x86 targets,
but a runtime test needs to check avx512f effective target if the
instructions can be assembled.
The runtime test also needs to check if the target machine supports
the instruction set we have been compiled for. The testsuite
Introduce vec_shl_ and vec_shr_ expanders to improve
'*a = __builtin_shufflevector(*a, (vect64){0}, 1, 2, 3, 4);'
and
'*a = __builtin_shufflevector((vect64){0}, *a, 3, 4, 5, 6);'
shuffles. The generated code improves from:
movzwl 6(%rdi), %eax
movzwl 4(%rdi), %edx
salq
On Mon, Feb 5, 2024 at 5:43 PM H.J. Lu wrote:
>
> Changes in v6:
>
> 1. Use ix86_save_reg and accessible_reg_set in
> x86_64_select_profile_regnum.
> 2. Construct a complete reg name in x86_function_profiler.
>
> Changes in v5:
>
> 1. Add pr113689-3.c.
> 2. Use %r10 if
On Fri, Feb 2, 2024 at 11:47 PM H.J. Lu wrote:
>
> Changes in v5:
>
> 1. Add pr113689-3.c.
> 2. Use %r10 if ix86_profile_before_prologue () return true.
> 3. Try a callee-saved register which has been saved on stack in the
> prologue.
>
> Changes in v4:
>
> 1. Remove pr113689-3.c.
> 2. Use
On Mon, Feb 5, 2024 at 9:06 AM Uros Bizjak wrote:
>
> On Mon, Feb 5, 2024 at 1:24 AM Roger Sayle wrote:
> >
> >
> > This patch fixes PR target/113690, an ICE-on-valid regression on x86_64
> > that exhibits with a specific combination of command line options. The
On Wed, Jan 31, 2024 at 9:23 AM Jakub Jelinek wrote:
>
> Hi!
>
> The move of the vzeroupper pass from after reload pass to after
> postreload_cse helped only partially, CSE-like passes can still invalidate
> those notes (especially REG_UNUSED) if they use some earlier register
> holding some
On Mon, Feb 5, 2024 at 1:24 AM Roger Sayle wrote:
>
>
> This patch fixes PR target/113690, an ICE-on-valid regression on x86_64
> that exhibits with a specific combination of command line options. The
> cause is that x86's scalar-to-vector pass converts a chain of instructions
> from TImode to
On Fri, Feb 2, 2024 at 9:59 AM Rainer Orth
wrote:
>
> gcc.target/i386/pr71321.c FAILs on 64-bit Solaris/x86 with the native
> assembler:
>
> FAIL: gcc.target/i386/pr71321.c scan-assembler-not lea.*0
>
> The problem is that /bin/as doesn't fully support cfi directives, so the
> .eh_frame section
The fix for PR70321 introduced a splitter that split a doubleword
comparison into a pair of XORs followed by an IOR to set the (zero)
flags register. To help reload, the splitter forced SUBREG pieces of
double-word input values to a pseudo, but this regressed
gcc.target/i386/pr82580.c
int f0 (U
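As a rough C-level picture of what such a split computes (a sketch, not the RTL splitter and not the truncated testcase above), a double-word equality test can be done on 32-bit halves with two XORs and one IOR. The function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Double-word equality on 32-bit halves: XOR each pair of halves, IOR the
   two results, and compare the combined value with zero (on x86 the IOR
   would set the zero flag). */
bool dw_equal(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);
    uint32_t diff = (a_lo ^ b_lo) | (a_hi ^ b_hi);
    return diff == 0;
}

/* Compares the half-word version with the plain 64-bit comparison. */
void dw_equal_selfcheck(void)
{
    const uint64_t v[] = { 0, 1, UINT64_C(1) << 32, UINT64_C(0x123456789abcdef0),
                           UINT64_MAX };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(dw_equal(v[i], v[j]) == (v[i] == v[j]));
}
```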
On Thu, Feb 1, 2024 at 3:18 PM Richard Biener wrote:
>
> The following avoids re-using a register holding a pointer (and
> thus might be REG_POINTER) for the result of a pointer difference
> computation. That might confuse heuristics in (broken) RTL alias
> analysis which relies on REG_POINTER
On Wed, Jan 31, 2024 at 1:57 PM Rainer Orth
wrote:
>
> The gcc.target/i386/no-callee-saved-[12].c tests FAIL on Solaris/x86:
>
> FAIL: gcc.target/i386/no-callee-saved-1.c scan-assembler-not push
> FAIL: gcc.target/i386/no-callee-saved-2.c scan-assembler-not push
>
> In both cases, the test
On Wed, Jan 31, 2024 at 2:02 PM Rainer Orth
wrote:
>
> The gcc.target/i386/pr38534-1.c etc. tests FAIL on 32 and 64-bit
> Solaris/x86:
>
> FAIL: gcc.target/i386/pr38534-1.c scan-assembler-not push
> FAIL: gcc.target/i386/pr38534-2.c scan-assembler-not push
> FAIL: gcc.target/i386/pr38534-3.c
On Wed, Jan 31, 2024 at 3:04 PM Rainer Orth
wrote:
>
> Three patches have remained unreviewed for a week or more:
>
> c++: Fix g++.dg/ext/attr-section2.C etc. with Solaris/SPARC as
> https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643434.html
>
> This one may even be
On Wed, Jan 24, 2024 at 10:07 AM Rainer Orth
wrote:
>
> gcc.target/i386/pr80833-1.c FAILs on 32-bit Solaris/x86 since 20220609:
>
> FAIL: gcc.target/i386/pr80833-1.c scan-assembler pextrd
>
> Unlike e.g. Linux/i686, 32-bit Solaris/x86 defaults to -mstackrealign,
> so this patch overrides that to
On Fri, Jan 19, 2024 at 5:50 PM Jeff Law wrote:
>
>
>
> On 1/19/24 09:05, Georg-Johann Lay wrote:
> >
> >
> > Am 18.01.24 um 20:54 schrieb Roger Sayle:
> >>
> >> This patch tweaks RTL expansion of multi-word shifts and rotates to use
> >> PLUS rather than IOR for disjunctive operations. During
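A small sketch of why PLUS and IOR are interchangeable here, assuming a 64-bit left shift built from 32-bit words and a shift count strictly between 0 and 32. The two terms that form the high word never share a set bit, so the addition cannot carry. The function names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* High word of (hi:lo) << n for 0 < n < 32, combined with IOR. */
uint32_t shl64_high_ior(uint32_t hi, uint32_t lo, unsigned n)
{
    return (hi << n) | (lo >> (32 - n));
}

/* Same value combined with PLUS: (hi << n) has its low n bits clear and
   (lo >> (32 - n)) only uses the low n bits, so no carry can occur. */
uint32_t shl64_high_plus(uint32_t hi, uint32_t lo, unsigned n)
{
    return (hi << n) + (lo >> (32 - n));
}

/* Checks both forms against a plain 64-bit shift for every valid count. */
void shl64_selfcheck(uint32_t hi, uint32_t lo)
{
    for (unsigned n = 1; n < 32; n++) {
        uint64_t wide = (((uint64_t)hi << 32) | lo) << n;
        uint32_t expect = (uint32_t)(wide >> 32);
        assert(shl64_high_ior(hi, lo, n) == expect);
        assert(shl64_high_plus(hi, lo, n) == expect);
    }
}
```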
On Thu, Jan 18, 2024 at 8:31 AM Jakub Jelinek wrote:
>
> Hi!
>
> x86_function_profiler emits assembly directly into file and only emits
> AT&T syntax. The following patch adjusts it to emit MASM syntax
> if -masm=intel.
> As it doesn't use asm_fprintf, I can't use {|} syntax for the dialects.
>
>
On Thu, Jan 11, 2024 at 7:24 PM Fangrui Song wrote:
>
> Printing the raw symbol is useful in inline asm (e.g. in C++ to get the
> mangled name). Similar constraints are available in other targets (e.g.
> "S" for aarch64/riscv, "Cs" for m68k).
>
> There isn't a good way for x86 yet, e.g. "i"
On Thu, Jan 11, 2024 at 9:33 AM Fangrui Song wrote:
>
> On 2024-01-11, Uros Bizjak wrote:
> >On Thu, Jan 11, 2024 at 4:44 AM Fangrui Song wrote:
> >>
> >> Printing the raw symbol is useful in inline asm (e.g. in C++ to get the
> >> mangled name). Sim
On Thu, Jan 11, 2024 at 4:44 AM Fangrui Song wrote:
>
> Printing the raw symbol is useful in inline asm (e.g. in C++ to get the
> mangled name). Similar constraints are available in other targets (e.g.
> "S" for aarch64/riscv, "Cs" for m68k).
>
> There isn't a good way for x86 yet, e.g. "i"
On Tue, Jan 9, 2024 at 11:19 AM Uros Bizjak wrote:
>
> On Tue, Jan 9, 2024 at 11:06 AM Richard Biener wrote:
> >
> > On Tue, 9 Jan 2024, Uros Bizjak wrote:
> >
> > > On Tue, Jan 9, 2024 at 10:44 AM Richard Biener wrote:
> > > >
On Tue, Jan 9, 2024 at 11:06 AM Richard Biener wrote:
>
> On Tue, 9 Jan 2024, Uros Bizjak wrote:
>
> > On Tue, Jan 9, 2024 at 10:44 AM Richard Biener wrote:
> > >
> > > On Tue, 9 Jan 2024, Uros Bizjak wrote:
> > >
> > > >
On Tue, Jan 9, 2024 at 10:44 AM Richard Biener wrote:
>
> On Tue, 9 Jan 2024, Uros Bizjak wrote:
>
> > On Tue, Jan 9, 2024 at 9:58 AM Richard Biener wrote:
> > >
> > > On Mon, 8 Jan 2024, Uros Bizjak wrote:
> > >
> > > >
On Tue, Jan 9, 2024 at 9:58 AM Richard Biener wrote:
>
> On Mon, 8 Jan 2024, Uros Bizjak wrote:
>
> > On Mon, Jan 8, 2024 at 5:57 PM Andrew Pinski wrote:
> > >
> > > On Mon, Jan 8, 2024 at 6:44 AM Uros Bizjak wrote:
> > > >
> > > > Instea
On Mon, Jan 8, 2024 at 5:57 PM Andrew Pinski wrote:
>
> On Mon, Jan 8, 2024 at 6:44 AM Uros Bizjak wrote:
> >
> > Instead of converting XOR or PLUS of two values, ANDed with two constants
> > that
> > have no bits in common, to IOR expression, convert IOR or XO
Instead of converting XOR or PLUS of two values, ANDed with two constants that
have no bits in common, to IOR expression, convert IOR or XOR of said two
ANDed values to PLUS expression.
If we consider the following testcase:
--cut here--
unsigned int foo (unsigned int a, unsigned int b)
{
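The identity behind this change can be checked with a short sketch. This is not the testcase from the message (which is cut off above); the masks and function names are assumptions chosen for illustration. When the two constants have no bits in common, IOR, XOR and PLUS of the two masked values all give the same result.

```c
#include <assert.h>
#include <stdint.h>

/* Two example masks with no bits in common (an assumption for this sketch). */
#define MASK_LO UINT32_C(0x0000ffff)
#define MASK_HI UINT32_C(0xffff0000)

uint32_t merge_ior(uint32_t a, uint32_t b)  { return (a & MASK_LO) | (b & MASK_HI); }
uint32_t merge_xor(uint32_t a, uint32_t b)  { return (a & MASK_LO) ^ (b & MASK_HI); }
uint32_t merge_plus(uint32_t a, uint32_t b) { return (a & MASK_LO) + (b & MASK_HI); }

/* With disjoint masks no bit position is set in both operands, so XOR
   equals IOR and the addition never produces a carry. */
void merge_selfcheck(uint32_t a, uint32_t b)
{
    assert(merge_ior(a, b) == merge_xor(a, b));
    assert(merge_ior(a, b) == merge_plus(a, b));
}
```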
On Mon, Jan 8, 2024 at 10:56 AM Richard Biener wrote:
>
> It was noticed that -mmovbe doesn't use movbe for __builtin_bswap{32,64}
> when not optimizing. The following adjusts the documentation to
> say it will be used for optimizing and applies to all byte swaps,
> not just those carried out
On Sat, Jan 6, 2024 at 2:30 PM Roger Sayle wrote:
>
>
> This patch improves the cost/gain calculation used during the i386 backend's
> SImode/DImode scalar-to-vector (STV) conversion pass. The current code
> handles loads and stores, but doesn't consider that converting other
> scalar operations
Hello!
I have sent an explanation on ICE in try_combine on pr112494.c [1], and
an argument that explains why we can safely ignore non-COMPARISON_P
mode changes [2].
Can we proceed with the proposed solution?
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638726.html
[2]