Re: [PATCH 4/4] tcg/optimize: optimize TSTNE using smask and zmask

2024-02-29 Thread Richard Henderson

On 2/28/24 23:35, Paolo Bonzini wrote:

On 2/29/24 00:10, Richard Henderson wrote:

On 2/28/24 01:11, Paolo Bonzini wrote:

-    /* TSTNE x,sign -> LT x,0 */
-    if (arg_is_const_val(*p2, (ctx->type == TCG_TYPE_I32
-   ? INT32_MIN : INT64_MIN))) {
+    /* TSTNE x,i -> LT x,0 if i only includes sign bit copies */
+    if (arg_is_const(*p2) && (arg_info(*p2)->val & ~i1->s_mask) == 0) {


This is a good idea, but s_mask isn't defined like you think -- it is *repetitions* of 
the sign bit, but not including the sign bit itself. For INT64_MIN, s_mask == 0.


So for TSTNE min,min, (min & ~0) != 0, so the test won't pass.


Oh! So I have to squash:

diff --git a/tcg/optimize.c b/tcg/optimize.c
index ab976a5bbe7..44d1b1a6d8a 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -140,6 +140,12 @@ static inline bool arg_is_const_val(TCGArg arg, uint64_t 
val)
  return ts_is_const_val(arg_temp(arg), val);
  }

+/* Calculate all the copies of the sign bit, both redundant and not. */
+static inline uint64_t all_sign_bit_copies(TempOptInfo *info)
+{
+    return (info->s_mask >> 1) | INT64_MIN;
+}


You need to care about type too -- for TCG_TYPE_I32, you'll want to OR in INT32_MIN.  The 
high bits of s_mask will be unknown (might be 1's from fold_masks, might be 0 from reset_ts).


But otherwise that's a good solution.


r~



Re: [PATCH 4/4] tcg/optimize: optimize TSTNE using smask and zmask

2024-02-29 Thread Paolo Bonzini

On 2/29/24 00:10, Richard Henderson wrote:

On 2/28/24 01:11, Paolo Bonzini wrote:

-    /* TSTNE x,sign -> LT x,0 */
-    if (arg_is_const_val(*p2, (ctx->type == TCG_TYPE_I32
-   ? INT32_MIN : INT64_MIN))) {
+    /* TSTNE x,i -> LT x,0 if i only includes sign bit copies */
+    if (arg_is_const(*p2) && (arg_info(*p2)->val & ~i1->s_mask) == 0) {


This is a good idea, but s_mask isn't defined like you think -- it is 
*repetitions* of the sign bit, but not including the sign bit itself.  
For INT64_MIN, s_mask == 0.


So for TSTNE min,min, (min & ~0) != 0, so the test won't pass.


Oh! So I have to squash:

diff --git a/tcg/optimize.c b/tcg/optimize.c
index ab976a5bbe7..44d1b1a6d8a 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -140,6 +140,12 @@ static inline bool arg_is_const_val(TCGArg arg, uint64_t 
val)
 return ts_is_const_val(arg_temp(arg), val);
 }
 
+/* Calculate all the copies of the sign bit, both redundant and not. */

+static inline uint64_t all_sign_bit_copies(TempOptInfo *info)
+{
+return (info->s_mask >> 1) | INT64_MIN;
+}
+
 static inline bool ts_is_copy(TCGTemp *ts)
 {
 return ts_info(ts)->next_copy != ts;
@@ -825,7 +831,7 @@ static int do_constant_folding_cond1(OptContext *ctx, TCGOp 
*op, TCGArg dest,
 }
 
 /* TSTNE x,i -> LT x,0 if i only includes sign bit copies */

-if (arg_is_const(*p2) && (arg_info(*p2)->val & ~i1->s_mask) == 0) {
+if (arg_is_const(*p2) && (arg_info(*p2)->val & ~all_sign_bit_copies(i1)) 
== 0) {
 *p2 = arg_new_constant(ctx, 0);
 *pcond = tcg_tst_ltge_cond(cond);
 return -1;


I tested with

   movq $0x8000, %rbx
   test %ebx, %ebx
   js y

and I get

 brcond_i64 cc_dst,$0x8000,tstne,$L1

which works and matches your explanation:

 i1.s_mask == 0x
 i2.val == 0x8000
 all_sign_bit_copies(i1) == 0x8000
 u2.val & ~all_sign_bit_copies(i1) == 0

Thanks!

Paolo




Re: [PATCH 4/4] tcg/optimize: optimize TSTNE using smask and zmask

2024-02-28 Thread Richard Henderson

On 2/28/24 01:11, Paolo Bonzini wrote:

-/* TSTNE x,sign -> LT x,0 */
-if (arg_is_const_val(*p2, (ctx->type == TCG_TYPE_I32
-   ? INT32_MIN : INT64_MIN))) {
+/* TSTNE x,i -> LT x,0 if i only includes sign bit copies */
+if (arg_is_const(*p2) && (arg_info(*p2)->val & ~i1->s_mask) == 0) {


This is a good idea, but s_mask isn't defined like you think -- it is *repetitions* of the 
sign bit, but not including the sign bit itself.  For INT64_MIN, s_mask == 0.


So for TSTNE min,min, (min & ~0) != 0, so the test won't pass.

r~