On Thu, Oct 5, 2023 at 1:45 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
> Doh! ENOPATCH.
>
> > -----Original Message-----
> > From: Roger Sayle <ro...@nextmovesoftware.com>
> > Sent: 05 October 2023 12:44
> > To: 'gcc-patches@gcc.gnu.org' <gcc-patches@gcc.gnu.org>
> > Cc: 'Uros Bizjak' <ubiz...@gmail.com>
> > Subject: [X86 PATCH] Implement doubleword shift left by 1 bit using add+adc.
> >
> >
> > This patch tweaks the i386 back-end's ix86_split_ashl to implement doubleword
> > left shifts by 1 bit, using an add followed by an add-with-carry (i.e. a
> > doubleword x+x) instead of using the x86's shld instruction.
> > The replacement sequence both requires fewer bytes and is faster on both Intel
> > and AMD architectures (from Agner Fog's latency tables and confirmed by my
> > own microbenchmarking).
> >
> > For the test case:
> > __int128 foo(__int128 x) { return x << 1; }
> >
> > with -O2 we previously generated:
> >
> > foo:    movq    %rdi, %rax
> >         movq    %rsi, %rdx
> >         shldq   $1, %rdi, %rdx
> >         addq    %rdi, %rax
> >         ret
> >
> > with this patch we now generate:
> >
> > foo:    movq    %rdi, %rax
> >         movq    %rsi, %rdx
> >         addq    %rdi, %rax
> >         adcq    %rsi, %rdx
> >         ret
> >
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and
> > make -k check, both with and without --target_board=unix{-m32} with no new
> > failures. Ok for mainline?
> >
> >
> > 2023-10-05  Roger Sayle  <ro...@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         * config/i386/i386-expand.cc (ix86_split_ashl): Split shifts by
> >         one into add3_cc_overflow_1 followed by add3_carry.
> >         * config/i386/i386.md (@add<mode>3_cc_overflow_1): Renamed from
> >         "*add<mode>3_cc_overflow_1" to provide generator function.
> >
> > gcc/testsuite/ChangeLog
> >         * gcc.target/i386/ashldi3-2.c: New 32-bit test case.
> >         * gcc.target/i386/ashlti3-3.c: New 64-bit test case.

OK.

Thanks,
Uros.

> >
> >
> > Thanks in advance,
> > Roger
> > --
>
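
For readers following the thread: the patch rests on the identity that a doubleword x << 1 equals a doubleword x + x, where the carry out of the low-half add feeds the high-half add. The C sketch below models both forms on two 64-bit halves so the equivalence can be checked by hand. The names u128_parts, shl1_shld and shl1_add_adc are hypothetical and do not appear in the patch; this is an illustration of the arithmetic only, not the code GCC emits or the logic in ix86_split_ashl.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a 128-bit value held as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_parts;

/* Shift left by 1 in the shld style: the high half is shifted and
   receives the top bit of the low half; the low half is shifted. */
u128_parts shl1_shld(u128_parts x)
{
    u128_parts r;
    r.hi = (x.hi << 1) | (x.lo >> 63);
    r.lo = x.lo << 1;
    return r;
}

/* Shift left by 1 as x + x: add the low halves, detect the carry out
   of that add, then add the high halves plus the carry (add, then adc). */
u128_parts shl1_add_adc(u128_parts x)
{
    u128_parts r;
    r.lo = x.lo + x.lo;
    uint64_t carry = (r.lo < x.lo);   /* unsigned wraparound means carry */
    r.hi = x.hi + x.hi + carry;
    return r;
}
```

Both functions should agree for every input, since the bit shifted out of the low half is exactly the carry produced by adding the low half to itself.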