On Tue, Apr 30, 2024 at 3:38 PM Jakub Jelinek wrote:
>
> On Tue, Apr 30, 2024 at 09:30:00AM +0200, Richard Biener wrote:
> > On Mon, Apr 29, 2024 at 5:30 PM H.J. Lu wrote:
> > >
> > > On Mon, Apr 29, 2024 at 6:47 AM liuhongt wrote:
> > > >
> > > > The Fortran standard does not specify what the
On Wed, Apr 24, 2024 at 1:46 PM Haochen Jiang wrote:
>
> Hi all,
>
> When we are using -mavx10.1-256 in command line and avx10.1-256 in
> target attribute together, zmm should never be generated. But current
> GCC will generate zmm since it wrongly enables EVEX512 for non-explicitly
> set AVX512.
On Sat, Apr 13, 2024 at 6:42 AM H.J. Lu wrote:
>
> The x86 instruction size limit is 15 bytes. If an NDD instruction has
> a segment prefix byte, a 4-byte opcode prefix, a MODRM byte, a SIB byte,
> a 4-byte displacement and a 4-byte immediate, adding an address size
> prefix will exceed the size
On Tue, Apr 9, 2024 at 3:05 PM Hongyu Wang wrote:
>
> The latest APX spec announced the removal of SHA/KEYLOCKER EVEX promotion [1],
> which means the SHA/KEYLOCKER insns do not support EGPR when APX is
> enabled. Update the corresponding constraints to their EGPR-disabled
> counterparts.
>
>
On Tue, Apr 9, 2024 at 5:18 PM Jakub Jelinek wrote:
>
> On Tue, Apr 09, 2024 at 11:23:40AM +0800, Hongtao Liu wrote:
> > I think we can merge alternative 2 with 3 to
> > * return TARGET_AES ? \"vaesenc\t{%2, %1, %0|%0, %1, %2}"\" :
> > \"
On Thu, Apr 4, 2024 at 4:42 PM Jakub Jelinek wrote:
>
> On Wed, Apr 19, 2023 at 02:40:59AM +, Jiang, Haochen via Gcc-patches
> wrote:
> > > > (define_insn "aesenc"
> > > > - [(set (match_operand:V2DI 0 "register_operand" "=x,x")
> > > > - (unspec:V2DI [(match_operand:V2DI 1
On Tue, Apr 9, 2024 at 9:58 AM H.J. Lu wrote:
>
> Define __APX_INLINE_ASM_USE_GPR32__ for -mapx-inline-asm-use-gpr32.
> When __APX_INLINE_ASM_USE_GPR32__ is defined, inline asm statements
> should contain only instructions compatible with r16-r31.
Ok.
>
> gcc/
>
> PR target/114587
>
On Mon, Apr 8, 2024 at 11:44 PM H.J. Lu wrote:
>
> Define following macros for APX options:
>
> 1. __APX_EGPR__: -mapx-features=egpr.
> 2. __APX_PUSH2POP2__: -mapx-features=push2pop2.
> 3. __APX_NDD__: -mapx-features=ndd.
> 4. __APX_PPX__: -mapx-features=ppx.
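The macro list quoted above can be probed at compile time. The following is a minimal sketch, assuming the macro names exactly as proposed in the quoted patch; the enum constants and the function name `apx_enabled_features` are hypothetical and exist only for illustration.

```c
/* Hypothetical bit flags, one per APX sub-feature named in the quoted patch. */
enum {
    APX_EGPR      = 1 << 0,
    APX_PUSH2POP2 = 1 << 1,
    APX_NDD       = 1 << 2,
    APX_PPX       = 1 << 3
};

/* Return a bitmask of the APX sub-features that the compiler reports
   through the predefined macros proposed in the patch.  On a compiler
   that defines none of them, the result is simply 0. */
unsigned apx_enabled_features(void)
{
    unsigned mask = 0;
#ifdef __APX_EGPR__
    mask |= APX_EGPR;
#endif
#ifdef __APX_PUSH2POP2__
    mask |= APX_PUSH2POP2;
#endif
#ifdef __APX_NDD__
    mask |= APX_NDD;
#endif
#ifdef __APX_PPX__
    mask |= APX_PPX;
#endif
    return mask;
}
```

Because every check is an `#ifdef`, the sketch compiles unchanged whether or not the compiler knows about APX.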
For -mapx-features=, we haven't
On Tue, Mar 26, 2024 at 11:26 AM Hongtao Liu wrote:
>
> On Mon, Mar 25, 2024 at 8:51 PM Jakub Jelinek wrote:
> >
> > On Tue, Mar 12, 2024 at 07:57:59PM +0800, liuhongt wrote:
> > > if alignb > ASAN_RED_ZONE_SIZE and offset[0] is not multiple of
> > > alig
On Mon, Mar 25, 2024 at 8:51 PM Jakub Jelinek wrote:
>
> On Tue, Mar 12, 2024 at 07:57:59PM +0800, liuhongt wrote:
> > If alignb > ASAN_RED_ZONE_SIZE and offset[0] is not a multiple of
> > alignb, (base_align_bias - base_offset) may not be aligned to alignb,
> > which caused a segmentation fault.
> >
> >
On Tue, Mar 19, 2024 at 12:16 AM Joseph Myers wrote:
>
> On Mon, 18 Mar 2024, liuhongt wrote:
>
> > +If @option{-fexcess-precision=16} is specified, casts and assignments of
> > +@code{_Float16} and @code{bfloat16_t} cause value to be rounded to their
> > +semantic types if they're supported by
On Mon, Mar 18, 2024 at 6:59 PM Uros Bizjak wrote:
>
> On Mon, Mar 18, 2024 at 11:52 AM liuhongt wrote:
> >
> > Commit r14-9459-g618e34d56cc38e only handles
> > general_scalar_chain::convert_op. The patch also handles
> > timode_scalar_chain::convert_op to avoid potential similar bug.
> >
> >
On Thu, Mar 14, 2024 at 11:42 PM Andrew Stubbs wrote:
>
> Don't enable excess lanes when inverting vector bit-masks smaller than the
> integer mode. This is yet another case of wrong-code due to mishandling
> of oversized bitmasks.
>
> This issue shows up in vect/tsvc/vect-tsvc-s278.c and
>
On Thu, Mar 14, 2024 at 10:46 PM Uros Bizjak wrote:
>
> On Thu, Mar 14, 2024 at 8:42 AM Uros Bizjak wrote:
> >
> > On Thu, Mar 14, 2024 at 8:32 AM Hongtao Liu wrote:
> > >
> > > On Thu, Mar 14, 2024 at 3:22 PM Uros Bizjak wrote:
> > > >
> >
On Thu, Mar 14, 2024 at 3:22 PM Uros Bizjak wrote:
>
> On Thu, Mar 14, 2024 at 2:33 AM liuhongt wrote:
> >
> > When we split
> > (insn 37 36 38 10 (set (reg:DI 104 [ _18 ])
> > (mem:DI (reg/f:SI 98 [ CallNative_nclosure.0_1 ]) [6 MEM[(struct
> > SQRefCounted
On Tue, Mar 12, 2024 at 8:00 PM liuhongt wrote:
>
> If alignb > ASAN_RED_ZONE_SIZE and offset[0] is not a multiple of
> alignb, (base_align_bias - base_offset) may not be aligned to alignb,
> which caused a segmentation fault.
>
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> Ok for trunk and
On Thu, Feb 29, 2024 at 2:20 PM Hongtao Liu wrote:
>
> On Wed, Feb 28, 2024 at 4:54 PM Jakub Jelinek wrote:
> >
> > Hi!
> >
> > Adding Hongtao and Honza into the loop as the ones who acked the original
> > patch.
> >
> > The no_callee_saved_regist
On Wed, Feb 28, 2024 at 4:54 PM Jakub Jelinek wrote:
>
> Hi!
>
> Adding Hongtao and Honza into the loop as the ones who acked the original
> patch.
>
> The no_callee_saved_registers by default for noreturn functions change can
> break in-process backtrace(3) or backtraces from debugger or other
On Tue, Feb 27, 2024 at 3:44 PM Richard Biener wrote:
>
> On Tue, 27 Feb 2024, haochen.jiang wrote:
>
> > On Linux/x86_64,
> >
> > af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
> > commit af66ad89e8169f44db723813662917cf4cbb78fc
> > Author: Richard Biener
> > Date: Fri Feb
On Mon, Feb 26, 2024 at 6:30 PM H.J. Lu wrote:
>
> On Sun, Feb 25, 2024 at 8:25 PM H.J. Lu wrote:
> >
> > On Sun, Feb 25, 2024 at 7:03 PM Hongtao Liu wrote:
> > >
> > > On Mon, Feb 26, 2024 at 10:37 AM H.J. Lu wrote:
> > > >
> >
MODE_NATURAL_SIZE (imode);
>
> Pan
>
> -----Original Message-----
> From: Hongtao Liu
> Sent: Monday, February 26, 2024 11:41 AM
> To: Li, Pan2
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com;
> richard.guent...@gmail.com; Wang, Yanzhang ;
> rda
On Mon, Feb 26, 2024 at 11:26 AM wrote:
>
> From: Pan Li
>
> We allowed vector type for get_stored_val when read is less than or
> equal to store in previous. Unfortunately, we missed to adjust the
> validate_subreg part accordingly. For vector type, we don't need to
> restrict the mode size
On Mon, Feb 26, 2024 at 10:37 AM H.J. Lu wrote:
>
> On Sun, Feb 25, 2024 at 6:03 PM Hongtao Liu wrote:
> >
> > On Mon, Feb 26, 2024 at 5:11 AM H.J. Lu wrote:
> > >
> > > ldtilecfg and sttilecfg take a 512-byte memory block. With
> > > _tile_loadconf
On Mon, Feb 26, 2024 at 5:11 AM H.J. Lu wrote:
>
> ldtilecfg and sttilecfg take a 512-byte memory block. With
> _tile_loadconfig implemented as
>
> extern __inline void
> __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> _tile_loadconfig (const void *__config)
> {
> __asm__
On Thu, Feb 22, 2024 at 10:33 PM H.J. Lu wrote:
>
> On Sun, Feb 18, 2024 at 8:02 AM H.J. Lu wrote:
> >
> > If the assembler and linker support
> >
> > add %reg1, name@gottpoff(%rip), %reg2
> >
> > with R_X86_64_CODE_6_GOTTPOFF, we can generate it instead of
> >
> > mov name@gottpoff(%rip), %reg2
>
On Wed, Feb 14, 2024 at 5:33 AM H.J. Lu wrote:
>
> Since push2/pop2 requires 16-byte stack alignment, don't generate them
> if the incoming stack isn't 16-byte aligned.
Ok.
>
> gcc/
>
> PR target/113912
> * config/i386/i386.cc (ix86_can_use_push2pop2): New.
>
On Tue, Feb 6, 2024 at 11:49 AM H.J. Lu wrote:
>
> 1. The only supported TLS code sequence with ADD is
>
> addq foo@gottpoff(%rip),%reg
>
> Change je constraint to a memory operand in APX NDD ADD pattern with
> register source operand.
>
> 2. The instruction length of APX NDD instructions
s been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures. Ok for mainline (in stage 1)?
Ok, thanks for handling this.
>
>
> 2024-01-25 Roger Sayle
> Hongtao Liu
>
>
On Tue, Jan 23, 2024 at 11:00 PM H.J. Lu wrote:
>
> Changes in v3:
>
> 1. Rebase against commit 02e68389494
> 2. Don't add call_no_callee_saved_registers to machine_function since
> all callee-saved registers are properly clobbered by callee with
> no_callee_saved_registers attribute.
>
The patch
On Mon, Jan 22, 2024 at 10:31 AM Haochen Jiang wrote:
>
> Hi all,
>
> Recently, I happened to run i386.exp under -DDEBUG and found some failures.
>
> This patch aims to fix that. Ok for trunk?
OK.
>
> Thx,
> Haochen
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/adx-check.h: Include
On Sat, Jan 20, 2024 at 10:30 PM H.J. Lu wrote:
>
> When an interrupt handler is implemented by an assembly stub which does:
>
> 1. Save all registers.
> 2. Call a C function.
> 3. Restore all registers.
> 4. Return from interrupt.
>
> it is completely unnecessary to save and restore any
On Wed, Jan 10, 2024 at 12:47 AM H.J. Lu wrote:
>
> When -fsanitize=hwaddress is used, libhwasan will try to enable LAM_U57
> in the startup code. Update the target check to enable hwaddress tests
> if LAM_U57 is enabled. Also compile hwaddress tests with -mlam=u57 on
> x86-64 since hwasan
On Wed, Jan 17, 2024 at 5:59 AM Roger Sayle wrote:
>
>
> I thought I'd just missed the bug fixing season of stage3, but there
> appears to be a little latitude in early stage4 (for vector patches), so
> I'll post this now.
>
> This patch resolves PR target/106060 by providing efficient methods for
>
On Thu, Jan 11, 2024 at 12:06 AM H.J. Lu wrote:
>
> On Tue, Jan 9, 2024 at 6:02 PM liuhongt wrote:
> >
> > After r14-2692-g1c6231c05bdcca, the option is defined as EnumSet and
> > -fcf-protection=branch won't unset any others bits since they're in
> > different groups. So to override
On Fri, Jan 12, 2024 at 10:55 AM Jiang, Haochen wrote:
>
> > -----Original Message-----
> > From: Richard Biener
> > Sent: Thursday, January 11, 2024 4:19 PM
> > To: Liu, Hongtao
> > Cc: Jiang, Haochen ; gcc-patches@gcc.gnu.org;
> > ubiz...@gmail.com; bur...@net-b.de; san...@codesourcery.com
>
On Thu, Jan 11, 2024 at 7:06 AM Andi Kleen wrote:
>
> Hongtao Liu writes:
> >>
> >> +@opindex mapx-inline-asm-use-gpr32
> >> +@item -mapx-inline-asm-use-gpr32
> >> +When APX_F is enabled, EGPR usage was by default disabled to prevent
> >> +u
On Tue, Jan 9, 2024 at 3:09 PM Hongyu Wang wrote:
>
> Hi,
>
> For APX, the inline asm behavior was not mentioned in any document
> before. Add description for it.
>
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * config/i386/i386.opt: Adjust document.
> * doc/invoke.texi: Add description
On Mon, Jan 8, 2024 at 11:09 AM Hongyu Wang wrote:
>
> Hi,
>
> The supported sub-features for APX were missing from the option documentation
> and the target attribute section. Add the missing ones.
>
> Ok for trunk?
Ok.
>
> gcc/ChangeLog:
>
> * config/i386/i386.opt: Add supported sub-features.
>
On Thu, Dec 14, 2023 at 12:03 AM Jan Hubicka wrote:
>
> > > The difference is that Cores understand the fact that fmadd does not need
> > > all three parameters to start computation, while Zen cores don't.
> > >
> > > Since this seems noticeable win on zen and not loss on Core it seems like
>
.c: Likewise.
> * gcc.target/i386/pr100865-5b.c: Likewise.
> * gcc.target/i386/pr100865-9a.c: Likewise.
> * gcc.target/i386/pr100865-9b.c: Likewise.
> * gcc.target/i386/pr102021.c: Likewise.
> * gcc.target/i386/pr90773-17.c: Likewise.
>
> Thanks in a
On Fri, Dec 22, 2023 at 6:25 PM Roger Sayle wrote:
>
>
> This patch resolves the second part of PR target/112992, building upon
> Hongtao Liu's solution to the first part.
>
> The issue addressed by this patch is that when initializing vectors by
> broadcasting integer constants, the compiler has
On Fri, Dec 15, 2023 at 10:34 AM Haochen Jiang wrote:
>
> Hi all,
>
> There is a recent change in AVX10 documentation which allows 64 bit mask
> register instructions in AVX10-256, the documentation comes following:
>
> Intel Advanced Vector Extensions 10 (Intel AVX10) Architecture Specification
On Thu, Dec 14, 2023 at 3:54 PM Hongyu Wang wrote:
>
> Hi,
>
> Currently move_max follows the tuning feature first, but ideally it
> should sync with prefer-vector-width when it is explicitly set to keep
> vector move and operation with same vector size.
>
> Bootstrapped/regtested on
On Thu, Dec 14, 2023 at 10:55 AM Haochen Jiang wrote:
>
> Hi all,
>
> According to ISE050 published at the end of September, RAO-INT will not
> be in Grand Ridge anymore. This patch aims to remove it.
>
> The documentation comes following:
>
> https://cdrdv2.intel.com/v1/dl/getContent/671368
>
>
On Wed, Dec 13, 2023 at 7:59 PM Jakub Jelinek wrote:
>
> On Fri, Dec 08, 2023 at 03:12:00PM +0800, liuhongt wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ready push to trunk.
> >
> > gcc/ChangeLog:
> >
> > PR target/112904
> > * config/i386/mmx.md
On Wed, Dec 13, 2023 at 4:44 PM Jakub Jelinek wrote:
>
> Hi!
>
> The following patch fixes ICE on the testcase in similar way to how
> other folded builtins are handled in ix86_gimple_fold_builtin when
> they don't have a lhs; these builtins are const or pure, so normally
> DCE would remove them
On Tue, Dec 12, 2023 at 10:38 PM Jan Hubicka wrote:
>
> Hi,
> this patch disables use of FMA in matrix multiplication loop for generic (for
> x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U.
>
> For Intel this is neutral both on the matrix multiplication microbenchmark
>
On Tue, Dec 12, 2023 at 1:47 PM Jiang, Haochen via Gcc-regression
wrote:
>
> > -----Original Message-----
> > From: Jiang, Haochen
> > Sent: Tuesday, December 12, 2023 9:11 AM
> > To: Andrew Pinski (QUIC) ; haochen.jiang
> > ; gcc-regress...@gcc.gnu.org; gcc-
> > patc...@gcc.gnu.org
> > Subject:
On Fri, Dec 8, 2023 at 10:17 AM liuhongt wrote:
>
> If the function doesn't clobber any SSE registers or only clobbers the
> 128-bit part, then vzeroupper isn't issued before the function exit.
> The status is not CLEAN but ANY after the function.
>
> Also for sibling_call, it's safe to issue an
On Mon, Dec 11, 2023 at 8:39 PM Hongyu Wang wrote:
>
> > > +__int128 u128_2 = (9223372036854775808 << 4) * foo0_u8_0; /* {
> > > dg-warning "integer constant is so large that it is unsigned" "so large"
> > > } */
> >
> > Just you can use (9223372036854775807LL + (__int128) 1) instead of
>
On Mon, Dec 11, 2023 at 4:14 PM Richard Biener
wrote:
>
> On Mon, Dec 11, 2023 at 7:51 AM liuhongt wrote:
> >
> > > since you are looking at TYPE_PRECISION below you want
> > > VECTOR_INTEGER_TYPE_P here as well? The alternative
> > > would be to compare TYPE_SIZE.
> > >
> > > Some of the
On Wed, Dec 6, 2023 at 3:52 PM Richard Biener
wrote:
>
> On Wed, Dec 6, 2023 at 3:33 AM Jiang, Haochen wrote:
> >
> > > -----Original Message-----
> > > From: Jiang, Haochen
> > > Sent: Friday, December 1, 2023 4:51 PM
> > > To: Richard Biener
> > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ;
>
ping.
On Thu, Nov 16, 2023 at 6:49 PM liuhongt wrote:
>
> Update in V2:
> 1) Add some comments before the pattern.
> 2) Remove ? from view_convert.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> When I'm working on PR112443, I notice there's some
On Wed, Dec 6, 2023 at 8:11 PM Uros Bizjak wrote:
>
> On Wed, Dec 6, 2023 at 9:08 AM Hongyu Wang wrote:
> >
> > Hi,
> >
> > Following up the discussion of V2 patches in
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639368.html,
> > this patch series add early clobber for all TImode
On Mon, Dec 4, 2023 at 10:10 PM Richard Biener
wrote:
>
> On Mon, Dec 4, 2023 at 6:32 AM liuhongt wrote:
> >
> > i.e. for the cases below.
> >a[0] = b1;
> >a[1] = b2;
> >..
> >a[n] = bn;
> >
> > There are extra dependences when constructing the vector, but not for
> > scalar store.
On Wed, Dec 6, 2023 at 6:23 AM Jakub Jelinek wrote:
>
> Hi!
>
> Regardless of the outcome of the REG_UNUSED discussions, I think
> it is a good idea to move the vzeroupper pass one pass later.
> As can be seen in the multiple PRs and as postreload.cc documents,
> reload/LRA is known to create
On Mon, Dec 4, 2023 at 3:51 PM Uros Bizjak wrote:
>
> On Mon, Dec 4, 2023 at 8:11 AM Hongtao Liu wrote:
> >
> > On Fri, Dec 1, 2023 at 10:26 PM Richard Biener
> > wrote:
> > >
> > > On Fri, Dec 1, 2023 at 3:39 AM liuhongt wrote:
> > >
On Tue, Dec 5, 2023 at 10:32 AM Hongyu Wang wrote:
>
> Hi,
>
> APX NDD patches have been posted at
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636604.html
>
> Thanks to Hongtao's review, the V2 patch adds support of zext semantics with
> memory input, as NDD by default clears the upper bits
On Fri, Dec 1, 2023 at 10:26 PM Richard Biener
wrote:
>
> On Fri, Dec 1, 2023 at 3:39 AM liuhongt wrote:
> >
> > > Hmm, I would suggest you put reg_needed into the class and accumulate
> > > over all vec_construct, with your patch you pessimize a single v32qi
> > > over two separate v16qi for
Any comments?
On Wed, Nov 22, 2023 at 12:17 PM liuhongt wrote:
>
> From: "Zhang, Annita"
>
> Avoid_fma_chain was enabled in m_SAPPHIRERAPIDS, m_ALDERLAKE and
> m_CORE_HYBRID. It can also be enabled in m_GENERIC to improve the
> performance of -march=x86-64-v3/v4 with -mtune=generic set by
>
On Wed, Nov 29, 2023 at 3:47 PM Richard Biener
wrote:
>
> On Tue, Nov 28, 2023 at 8:54 AM liuhongt wrote:
> >
> > For vec_construct, the components must be live at the same time if
> > they're not loaded from memory, when the number of those components
> > exceeds available registers, spill
On Wed, Nov 29, 2023 at 9:23 AM Hu, Lin1 wrote:
>
> Hi, all
>
> This patch aims to fix the wrong CPUID of USER_MSR. Its correct CPUID is
> (0x7, 0x1).EDX[15], but I set it as (0x7, 0x0).EDX[15]. The patch also modifies
> the testcase to give the user a better example.
>
> It has been bootstrapped and
On Tue, Nov 28, 2023 at 9:51 PM Hongyu Wang wrote:
>
> Hi,
>
> On Linux x86-64, -fomit-frame-pointer is enabled by default, so the
> push2pop2 tests' CFI scans are based on it. On other targets with
> -fno-omit-frame-pointer the CFI scan will be wrong, as the frame pointer
> is pushed first. Add
On Thu, Nov 23, 2023 at 2:10 PM Haochen Jiang wrote:
>
> Hi all,
>
> This patch should be able to fix the current issue mentioned in PR112643.
>
> Also, I fixed some legacy issues in code related to AVX512/AVX10.
>
> Ok for trunk?
Ok
>
> Thx,
> Haochen
>
> gcc/ChangeLog:
>
> PR
On Wed, Nov 22, 2023 at 11:31 AM Hongyu Wang wrote:
>
> Hi,
>
> The push2/pop2 operand order does not match the binutils implementation
> for AT&T syntax, which first pushes operands[2] and then operands[1].
> Correct it by reversing the operand order for AT&T syntax.
>
> Bootstrapped/regtested on
.
>
> Yes, such change also worked and no cfa adjustment required then,
> thanks for the suggestion.
> Updated patch with just 1 new UNSPEC and removed cfa handling.
LGTM.
>
> Hongtao Liu wrote on Mon, Nov 20, 2023 at 14:46:
> >
> > On Fri, Nov 17, 2023 at 3:26 PM Hongyu Wang wrote:
On Fri, Nov 17, 2023 at 3:26 PM Hongyu Wang wrote:
>
> Intel APX PPX feature has been released in [1].
>
> PPX stands for Push-Pop Acceleration. PUSH/PUSH2 and its corresponding POP
> can be marked with a 1-bit hint to indicate that the POP reads the
> value written by the PUSH from the stack.
On Fri, Nov 10, 2023 at 9:42 AM Haochen Jiang wrote:
>
> gcc/ChangeLog:
>
> * common/config/i386/cpuinfo.h (get_available_features):
> Add avx10_set and version and detect avx10.1.
> (cpu_indicator_init): Handle avx10.1-512.
> * common/config/i386/i386-common.cc
>
On Wed, Nov 15, 2023 at 5:43 PM Hongyu Wang wrote:
>
> Hi,
>
> vextract/insert{if}128 cannot use EGPR in their memory operand, so all
> related patterns should be adjusted to disable EGPR usage on them.
> Also fix a wrong gpr16 attr for insertps.
>
> Bootstrapped/regtested on
On Tue, Nov 14, 2023 at 5:01 PM Lehua Ding wrote:
>
> Hi,
>
> This little patch adjusts the assert in the apx-spill_to_egprs-1.c testcase.
> The -mapxf compilation option allows more registers to be used, which in
> turn eliminates the need for local variables to be stored in stack memory.
>
On Mon, Nov 13, 2023 at 7:25 PM Richard Biener
wrote:
>
> On Mon, Nov 13, 2023 at 7:58 AM Hongtao Liu wrote:
> >
> > On Fri, Nov 10, 2023 at 6:15 PM Richard Biener
> > wrote:
> > >
> > > On Fri, Nov 10, 2023 at 2:42 AM Haochen J
On Mon, Nov 13, 2023 at 4:45 PM Jakub Jelinek wrote:
>
> On Mon, Nov 13, 2023 at 02:27:35PM +0800, Hongtao Liu wrote:
> > > 1) if it isn't better to use separate alternative instead of
> > >x86_evex_reg_mentioned_p, like in the patch below
> > vblendps doesn't
On Fri, Nov 10, 2023 at 5:12 PM Richard Biener
wrote:
>
> On Wed, Nov 8, 2023 at 9:22 AM Hongtao Liu wrote:
> >
> > On Wed, Nov 8, 2023 at 3:53 PM Richard Biener
> > wrote:
> > >
> > > On Wed, Nov 8, 2023 at 2:18 AM Hongtao Liu wrote:
> > >
On Fri, Nov 10, 2023 at 2:14 PM liuhongt wrote:
>
> When I'm working on PR112443, I notice there's some misoptimizations:
> after we fold _mm{,256}_blendv_epi8/pd/ps into gimple, the backend
> fails to combine it back to v{,p}blendv{v,ps,pd} since the pattern is
> too complicated, so I think
On Fri, Nov 10, 2023 at 6:15 PM Richard Biener
wrote:
>
> On Fri, Nov 10, 2023 at 2:42 AM Haochen Jiang wrote:
> >
> > Hi all,
> >
> > This RFC patch aims to add AVX10.1 options. After we added -m[no-]evex512
> > support, it is a lot easier to add them compared to the August version.
> >
On Sat, Nov 11, 2023 at 4:11 AM Jakub Jelinek wrote:
>
> On Thu, Nov 09, 2023 at 03:27:11PM +0800, Hongtao Liu wrote:
> > On Thu, Nov 9, 2023 at 3:15 PM Hu, Lin1 wrote:
> > >
> > > This patch aims to avoid generating vblendps with ymm16+, and has been
> > > boo
On Fri, Nov 10, 2023 at 10:11 AM Andrew Pinski wrote:
>
> On Thu, Nov 9, 2023 at 5:52 PM liuhongt wrote:
> >
> > When I'm working on PR112443, I notice there's some misoptimizations: after
> > we
> > fold _mm{,256}_blendv_epi8/pd/ps into gimple, the backend fails to combine
> > it
> > back to
On Thu, Nov 9, 2023 at 3:15 PM Hu, Lin1 wrote:
>
> This patch aims to avoid generating vblendps with ymm16+, and has been
> bootstrapped and tested on x86_64-pc-linux-gnu{-m32,-m64}. Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/112435
> * config/i386/sse.md: Adding constraints to
On Wed, Nov 8, 2023 at 3:53 PM Richard Biener
wrote:
>
> On Wed, Nov 8, 2023 at 2:18 AM Hongtao Liu wrote:
> >
> > On Tue, Nov 7, 2023 at 10:34 PM Richard Biener
> > wrote:
> > >
> > > On Tue, Nov 7, 2023 at 2:03 PM Hongtao Liu wrote:
> > > >
On Tue, Nov 7, 2023 at 3:33 PM Hongyu Wang wrote:
>
> Hi,
>
> When APX EGPR is enabled, the TImode move pattern *movti_internal allows
> moves between GPR and SSE registers using constraint pair ("r","Yd"). Then a
> post-reload splitter transforms such a move to vec_extractv2di, while under
> -msse4.1
On Tue, Nov 7, 2023 at 10:34 PM Richard Biener
wrote:
>
> On Tue, Nov 7, 2023 at 2:03 PM Hongtao Liu wrote:
> >
> > On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
> > wrote:
> > >
> > > On Tue, Nov 7, 2023 at 7:08 AM liuhongt wrote:
> > >
On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
wrote:
>
> On Tue, Nov 7, 2023 at 7:08 AM liuhongt wrote:
> >
> > analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> > loop invariant which is not the case when it's INTEGER_CST.
> >
> > Bootstrapped and regtested on
On Tue, Nov 7, 2023 at 10:27 AM Haochen Jiang wrote:
>
> Hi all,
>
> This patch aims to fix the wrong isa attribute which caused regression
> on PR111907.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
>
> Thx,
> Haochen
>
> gcc/ChangeLog:
>
> PR target/111907
> *
On Mon, Nov 6, 2023 at 7:10 PM Jan Beulich wrote:
>
> On 25.06.2023 08:41, Hongtao Liu wrote:
> > On Sun, Jun 25, 2023 at 2:35 PM Hongtao Liu wrote:
> >>
> >> On Sun, Jun 25, 2023 at 2:25 PM Jan Beulich wrote:
> >>>
> >>> On 25.06.2023 07:1
On Fri, Nov 3, 2023 at 6:34 PM Uros Bizjak wrote:
>
> The patch generalizes address register class handling to allow multiple
> address register classes. For APX EGPR targets, some instructions can't be
> encoded with REX2 prefix, so it is necessary to limit address register
> class to avoid
On Tue, Oct 31, 2023 at 2:39 PM Haochen Jiang wrote:
>
> Hi all,
>
> These four patches are going to fix no-evex512 function attribute. The detail
> of the issue comes following:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111889
>
> My proposal for this problem is to also push "no-evex512"
On Mon, Oct 30, 2023 at 3:47 PM Haochen Jiang wrote:
>
> Hi all,
>
> This patch fixes two obvious bugs in the current evex512 implementation.
>
> Also, I moved AVX512CD+AVX512VL part out of the AVX512VL to avoid
> accidental handle miss in avx512cd in the future.
>
> Ok for trunk?
Ok.
>
> BRs,
>
On Fri, Oct 27, 2023 at 3:21 PM Hongtao Liu wrote:
>
> On Fri, Oct 27, 2023 at 2:49 PM Richard Biener
> wrote:
> >
> >
> >
> > > On 27.10.2023 at 07:50, liuhongt wrote:
> > >
> > > When 2 vectors are equal, kmask is allones
On Fri, Oct 27, 2023 at 2:49 PM Richard Biener
wrote:
>
>
>
> > On 27.10.2023 at 07:50, liuhongt wrote:
> >
> > When 2 vectors are equal, kmask is allones and kortest will set CF,
> > else CF will be cleared.
> >
> > So CF bit can be used to check for the result of the comparison.
> >
> >
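The CF argument quoted above can be modelled in portable C. This is only a sketch of the flag semantics for a 16-lane compare, not the code the patch generates; the function names `cmpeq_mask16`, `kortest_cf` and `vectors_equal` are hypothetical.

```c
#include <stdint.h>

/* Model of a 16-lane equality compare into a mask register:
   bit i of the result is set when a[i] == b[i]. */
uint16_t cmpeq_mask16(const int32_t a[16], const int32_t b[16])
{
    uint16_t mask = 0;
    for (int i = 0; i < 16; i++)
        if (a[i] == b[i])
            mask |= (uint16_t)(1u << i);
    return mask;
}

/* Model of the carry flag after a 16-bit kortest: CF is set when the
   OR of the two mask operands is all ones, and cleared otherwise. */
int kortest_cf(uint16_t k1, uint16_t k2)
{
    return (uint16_t)(k1 | k2) == 0xFFFF;
}

/* The vectors are equal exactly when the compare mask is all ones,
   so testing the mask against itself and reading CF answers the question. */
int vectors_equal(const int32_t a[16], const int32_t b[16])
{
    uint16_t k = cmpeq_mask16(a, b);
    return kortest_cf(k, k);
}
```

A single differing lane clears one mask bit, so the OR is no longer all ones and the modelled CF is 0.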
On Tue, Oct 24, 2023 at 6:10 PM Richard Sandiford
wrote:
>
> The files changed in this patch had tests for masked and unmasked
> popcnt. However, the mask inputs to the masked forms were undefined,
> and would be set to zero by init_regs. Any combine-like pass that
> ran after init_regs could
On Tue, Oct 24, 2023 at 1:23 PM Hongtao Liu wrote:
>
> On Tue, Oct 24, 2023 at 10:53 AM Hongtao Liu wrote:
> >
> > On Mon, Oct 23, 2023 at 8:35 PM Richard Biener
> > wrote:
> > >
> > > On Mon, Oct 23, 2023 at 10:48 AM liuhongt wrote:
> > >
On Tue, Oct 24, 2023 at 10:53 AM Hongtao Liu wrote:
>
> On Mon, Oct 23, 2023 at 8:35 PM Richard Biener
> wrote:
> >
> > On Mon, Oct 23, 2023 at 10:48 AM liuhongt wrote:
> > >
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > >
On Mon, Oct 23, 2023 at 8:35 PM Richard Biener
wrote:
>
> On Mon, Oct 23, 2023 at 10:48 AM liuhongt wrote:
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ready push to trunk.
>
> vcond and vcondeq shouldn't be necessary if there's
> vcond_mask and vcmp support which is the
On Wed, Oct 18, 2023 at 4:10 PM Haochen Jiang wrote:
>
> Hi all,
>
> I just found that since ISAs enabled on Sierra Forest changed, clients since
> Arrow Lake will wrongly enable ENQCMD according to the current code.
>
> To avoid messing up again in the future, I changed the dependency on how
On Wed, Oct 18, 2023 at 4:33 PM liuhongt wrote:
>
Cut from subject...
There's a loop in vect_peel_nonlinear_iv_init to get init_expr * pow
(step_expr, skip_niters). When skip_niters is too big, it hogs compile
time. To avoid that, optimize init_expr * pow (step_expr, skip_niters)
to init_expr <<
On Mon, Oct 16, 2023 at 2:25 PM Haochen Jiang wrote:
>
> Hi all,
>
> The patches aim to add new cpu archs Clear Water Forest and
> Panther Lake. Here comes the documentation:
>
> https://cdrdv2.intel.com/v1/dl/getContent/671368
>
> Also in the patches, I refactored how we detect cpu according to
On Thu, Jul 6, 2023 at 1:53 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Thu, Jul 6, 2023 at 3:14 AM liuhongt wrote:
> >
> > For testcase
> >
> > void __cond_swap(double* __x, double* __y) {
> > bool __r = (*__x < *__y);
> > auto __tmp = __r ? *__x : *__y;
> > *__y = __r ? *__y : *__x;
> >
On Tue, Oct 10, 2023 at 2:51 PM Hongyu Wang wrote:
>
> From: "Mo, Zewei"
>
> Hi,
>
> Intel APX PUSH2POP2 feature has been released in [1].
>
> This feature requires the stack to be 16-byte aligned; therefore, in
> prologue/epilogue, a standalone push/pop will be emitted before any
> push2/pop2 if
On Mon, Oct 9, 2023 at 10:05 AM Hongyu Wang wrote:
>
> For vec_concatv2di, m constraint in alternative 0 and 1 could result in
> egpr allocated on operand 2 under -mapxf. Should use jm instead.
>
> Bootstrapped/regtested on x86-64-linux-gnu.
>
> Ok for trunk?
Ok.
>
> gcc/ChangeLog:
>
> *
> (apx_egpr): Likewise.
> (apx_push2pop2): Likewise.
> (apx_ndd): Likewise.
> (apx_all): Likewise.
> * doc/invoke.texi: Document mapxf.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/apx-1.c: New test.
>
> Co-aut