Re: [PATCH] Recognize a missed usage of an sbfiz instruction

2018-02-02 Thread Kyrill Tkachov

Hi Luis,


On 02/02/18 14:38, Luis Machado wrote:

A customer reported the following missed opportunities to combine a couple
of instructions into an sbfiz.

int sbfiz32 (int x)
{
  return x << 29 >> 10;
}

long sbfiz64 (long x)
{
  return x << 58 >> 20;
}

This gets converted to the following pattern:

(set (reg:SI 98)
(ashift:SI (sign_extend:SI (reg:HI 0 x0 [ xD.3334 ]))
(const_int 6 [0x6])))

Currently, gcc generates the following:

sbfiz32:
lsl x0, x0, 29
asr x0, x0, 10
ret

sbfiz64:
lsl x0, x0, 58
asr x0, x0, 20
ret

It could generate this instead:

sbfiz32:
sbfiz   w0, w0, 19, 3
ret

sbfiz64:
sbfiz   x0, x0, 38, 6
ret
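
The operand mapping here: for (x << l) >> r in an N-bit type, the insertion
point is lsb = l - r and the field width is N - l, so 29/10 gives 19, 3 and
58/20 gives 38, 6. A quick host-side check of that mapping (just a sketch,
not part of the patch; sbfiz_by_hand is a made-up helper):

#include <assert.h>
#include <stdint.h>

static int32_t
sbfiz_by_hand (int32_t x, int lsb, int width)
{
  /* Sign-extend the low 'width' bits of x, then shift left by 'lsb'.
     Left shifts are done on uint32_t to avoid signed-overflow UB; the
     signed right shift assumes GCC's arithmetic-shift behaviour.  */
  int32_t field = (int32_t) ((uint32_t) x << (32 - width)) >> (32 - width);
  return (int32_t) ((uint32_t) field << lsb);
}

int
main (void)
{
  /* x << 29 >> 10 should equal sbfiz with lsb = 29 - 10 = 19 and
     width = 32 - 29 = 3.  */
  for (int32_t x = -1000; x <= 1000; x++)
    assert ((int32_t) ((uint32_t) x << 29) >> 10 == sbfiz_by_hand (x, 19, 3));
  return 0;
}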

The unsigned versions already generate ubfiz for the same code, so the lack of
an sbfiz pattern may have been an oversight.
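
For comparison, the unsigned analogue (a hypothetical test, not from the
patch) already combines into a single instruction:

unsigned int
ubfiz32 (unsigned int x)
{
  return x << 29 >> 10;
}

/* Expected, per the mapping above: ubfiz w0, w0, 19, 3.  */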


You're right. In fact, llvm will generate the SBFIZ form.



This particular sbfiz pattern shows up in both CPU2006 (~80 hits) and
CPU2017 (~280 hits). It's not a lot, but it seems beneficial in any case.
We saw no significant performance differences, probably due to the small
number of occurrences or because the cases fall outside hot areas.


Yeah, these little missed opportunities add up in the long run :)



Regression-tested and bootstrapped ok on aarch64-linux. Validated with
CPU2017 and CPU2006 runs.

I thought I'd put this up for review, even though I know we're not in
development mode yet.

2018-02-02  Luis Machado  

gcc/
* config/aarch64/aarch64.md (*ashift<mode>_extv_bfiz): New pattern.
* testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c: New test.


I'm sure you know this already, but the testsuite entry will go into a
ChangeLog of its own in gcc/testsuite/, with the testsuite/ prefix removed.


---
 gcc/config/aarch64/aarch64.md                    | 13 +++++++++++++
 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c | 24 ++++++++++++++++++++++++
 2 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5a2a930..d336bf0 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4828,6 +4828,19 @@
   [(set_attr "type" "bfx")]
 )

+;; Match sbfiz pattern in a shift left + shift right operation.
+
+(define_insn "*ashift<mode>_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
+                                      (match_operand 2 "const_int_operand" "n")
+                                      (match_operand 3 "const_int_operand" "n"))
+                    (match_operand 4 "const_int_operand" "n")))]
+  "UINTVAL (operands[2]) < <sizen>"
+  "sbfiz\\t%<w>0, %<w>1, %4, %2"
+  [(set_attr "type" "bfx")]
+)


I think you need some more validation on operands 3 and 4. Look at other
similar patterns, but you want something like the aarch64_simd_shift_imm_<mode>
predicate on them to make sure they fall within [0, 63] or [0, 31].
Also, for operand 2 you probably want the aarch64_simd_shift_imm_offset_di
predicate to make sure it doesn't allow a size of zero. I don't know whether
the RTL optimisers will try to create such wonky RTL, but we tend to be
defensive in the pattern about these things.
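
Something like the following would be one way to apply that (an untested
sketch, not the submitted patch, using the <mode> forms of those predicates
so both the SI and DI cases are covered):

(define_insn "*ashift<mode>_extv_bfiz"
  [(set (match_operand:GPI 0 "register_operand" "=r")
        (ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
                                      ;; Field width: offset predicate disallows zero.
                                      (match_operand 2 "aarch64_simd_shift_imm_offset_<mode>" "n")
                                      ;; Field position, restricted to [0, 31]/[0, 63].
                                      (match_operand 3 "aarch64_simd_shift_imm_<mode>" "n"))
                    ;; Left-shift amount, i.e. the sbfiz lsb operand.
                    (match_operand 4 "aarch64_simd_shift_imm_<mode>" "n")))]
  "UINTVAL (operands[2]) < <sizen>"
  "sbfiz\\t%<w>0, %<w>1, %4, %2"
  [(set_attr "type" "bfx")]
)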


+
 ;; When the bit position and width of the equivalent extraction add up to 32
 ;; we can use a W-reg LSL instruction taking advantage of the implicit
 ;; zero-extension of the X-reg.
diff --git a/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
new file mode 100644
index 0000000..931f8f4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+/* Check that a LSL followed by an ASR can be combined into a single SBFIZ
+   instruction.  */
+
+/* Using W-reg */
+
+int
+sbfiz32 (int x)
+{
+  return x << 29 >> 10;
+}
+
+/* Using X-reg */
+
+long
+sbfiz64 (long x)
+{
+  return x << 58 >> 20;
+}


long will be 32 bits when testing with -mabi=ilp32, so you want this to be
long long, or some other guaranteed 64-bit type.
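
E.g. something like this (a sketch; long long is 64 bits under both LP64
and ILP32, so the X-reg case still gets exercised with -mabi=ilp32):

/* Using X-reg */

long long
sbfiz64 (long long x)
{
  return x << 58 >> 20;
}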

Thanks,
Kyrill


+
+/* { dg-final { scan-assembler "sbfiz\tw" } } */
+/* { dg-final { scan-assembler "sbfiz\tx" } } */
--
2.7.4




