[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2023-07-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

--- Comment #12 from Andrew Pinski  ---
Note the original example in comment #0 is now optimized for GCC 14 but only at
the RTL level rather than the gimple level.

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2023-07-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

--- Comment #11 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:6d449531a60b56ed0f4aeb640aa9e46e4ec35208

commit r14-2698-g6d449531a60b56ed0f4aeb640aa9e46e4ec35208
Author: Andrew Pinski 
Date:   Thu Jul 20 17:36:29 2023 -0700

MATCH: Add Max<Max<a,b>,a> -> Max<a,b> simplification

This adds a simple match pattern to simplify
`max<max<a,b>,a>` to `max<a,b>`.  Reassociation handles
this already (r0-77700-ge969dbde29bfd396259357) but
it seems like we should be able to handle this even before
reassociation.

This fixes part of PR tree-optimization/80574 but more
work is needed to fix it the rest of the way. The original
testcase there is fixed, but it is the RTL level that fixes
it the rest of the way.

OK? Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* match.pd (minmax<minmax<a,b>,a>->minmax<a,b>): New
transformation.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/reassoc-12.c: Disable all of
the passes that enable match-and-simplify.
* gcc.dg/tree-ssa/minmax-23.c: New test.

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

--- Comment #10 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #9)
> One thing I noticed is that:
>   _2 = MAX_EXPR <_6, a3_7(D)>;
>   _3 = MAX_EXPR <_2, a3_7(D)>;
> 
> Is not optimized at all.
> 
> (for minmax (min max)
>  (simplify
>   (minmax:c (minmax:c@2 @0 @1) @0)
>   @2))

Submitted the patch for that as
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625135.html .
Note after that patch we get decent code for the original testcases but it is
not fully optimized at the gimple level.

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2023-07-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

--- Comment #9 from Andrew Pinski  ---
One thing I noticed is that:
  _2 = MAX_EXPR <_6, a3_7(D)>;
  _3 = MAX_EXPR <_2, a3_7(D)>;

Is not optimized at all.

(for minmax (min max)
 (simplify
  (minmax:c (minmax:c@2 @0 @1) @0)
  @2))
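
The identity behind this pattern can be checked with a small C sketch. The helpers max_int and min_int are hypothetical stand-ins for MAX_EXPR and MIN_EXPR, and the test range is an arbitrary assumption; this only illustrates why returning the inner min/max (@2) is valid, and is not how match.pd itself is tested.

```c
#include <assert.h>

/* Hypothetical helpers standing in for GIMPLE MAX_EXPR / MIN_EXPR. */
static int max_int(int a, int b) { return a > b ? a : b; }
static int min_int(int a, int b) { return a < b ? a : b; }

/* Counts the (a, b) pairs in [-8, 8] x [-8, 8] for which re-applying
   min/max with an operand of the inner min/max changes the value.
   The pattern relies on this never happening.  */
static int count_minmax_mismatches(void)
{
    int mismatches = 0;
    for (int a = -8; a <= 8; a++) {
        for (int b = -8; b <= 8; b++) {
            int mx = max_int(a, b);
            int mn = min_int(a, b);
            /* The :c markers also cover the commuted operand orders. */
            if (max_int(mx, a) != mx || max_int(a, mx) != mx
                || max_int(max_int(b, a), a) != mx)
                mismatches++;
            if (min_int(mn, a) != mn || min_int(a, mn) != mn
                || min_int(min_int(b, a), a) != mn)
                mismatches++;
        }
    }
    return mismatches;
}
```

Calling count_minmax_mismatches() is expected to report no mismatches, which matches the claim that the outer min/max is redundant.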

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2023-05-01 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|8.0                         |---

--- Comment #8 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #7)
> The original testcase in comment #0 is fixed in GCC 8, I don't know what
> caused the improvement though.

Well actually if you use the C++ front-end, it still fails.

for f2_signed, we start out as:
  _1 = MAX_EXPR ;
  if (_1 >= a1_6(D))
goto ; [INV]
  else
goto ; [INV]

   :
  if (a3_4(D) < a2_5(D))
goto ; [INV]
  else
goto ; [INV]

   :

   :
  # iftmp.5_2 = PHI 
  return iftmp.5_2;

phiopt1 transforms it to:
  _1 = MAX_EXPR ;
  if (_1 >= a1_6(D))
goto ; [INV]
  else
goto ; [INV]

   :
  _3 = MAX_EXPR ;

   :
  # iftmp.12_2 = PHI <_3(3), a1_6(D)(2)>

Which is perfect.
But then we don't catch that _1 and _3 are the same, though we do at least try
to simplify it on the trunk:
phiopt match-simplify trying:
_1 >= a1_6(D) ? _3 : a1_6(D)

phiopt match-simplify trying:
_1 < a1_6(D) ? a1_6(D) : _3

What happens afterwards is that fre (or is it pre) figures out that _1 and _3
are the same, and we get:
  if (_1 >= a1_6(D))
    goto <bb 3>; [INV]
  else
    goto <bb 4>; [INV]

  <bb 3> :

  <bb 4> :
  # iftmp.12_2 = PHI <_1(3), a1_6(D)(2)>

Which phiopt2 is then able to simplify.
So if we iterate phiopt and fre we should be able to handle all of these, but
that is NOT a reasonable solution.
I really have to think of a good way of solving these.
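
For readers without the testcase at hand, the shape of code being discussed can be sketched as below. This is not the testcase from comment #0; max3_ternary is a hypothetical stand-in for f2_signed, and MAX_2 is assumed to be an argument-duplicating macro of the kind used in this bug. The sketch only checks that the nested ternaries compute the same value as two plain max operations, which is the form the optimizers are expected to reach.

```c
#include <assert.h>

/* Argument-duplicating macro in the style discussed in this bug. */
#define MAX_2(a, b) (((a) > (b)) ? (a) : (b))

/* Hypothetical stand-in for f2_signed: expands to nested ternaries. */
static int max3_ternary(int a1, int a2, int a3)
{
    return MAX_2(a1, MAX_2(a2, a3));
}

/* Reference form: two max operations, one per MAX_EXPR. */
static int max_int(int a, int b) { return a > b ? a : b; }

static int max3_reference(int a1, int a2, int a3)
{
    return max_int(a1, max_int(a2, a3));
}

/* Counts inputs in [-4, 4]^3 where the two forms disagree. */
static int count_max3_mismatches(void)
{
    int mismatches = 0;
    for (int a1 = -4; a1 <= 4; a1++)
        for (int a2 = -4; a2 <= 4; a2++)
            for (int a3 = -4; a3 <= 4; a3++)
                if (max3_ternary(a1, a2, a3) != max3_reference(a1, a2, a3))
                    mismatches++;
    return mismatches;
}
```

Compiling a function like max3_ternary with -fdump-tree-phiopt1 (or -fdump-tree-all) is one way to inspect the intermediate dumps quoted in this comment.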

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2021-07-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2021-07-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |8.0

--- Comment #7 from Andrew Pinski  ---
The original testcase in comment #0 is fixed in GCC 8, I don't know what caused
the improvement though.

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2017-05-04 Thread SztfG at yandex dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

SztfG at yandex dot ru changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |SztfG at yandex dot ru

--- Comment #6 from SztfG at yandex dot ru ---
Created attachment 41316
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41316&action=edit
some benchmark with macro stuff and std::max

Well, maybe this is also not related to this issue, but here is a benchmark in
which std::max is slower than the macro.

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2017-05-02 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-05-02
     Ever confirmed|0                           |1

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2017-05-01 Thread SztfG at yandex dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

--- Comment #5 from SztfG at yandex dot ru ---
> He did not claim it was always better...

Ahh, so I need to do some research to figure out in which cases a static inline
function is better and in which a macro is better. That's bad.

> Please don't mix unrelated issues

OK, I will file this in another bug report.

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2017-05-01 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

--- Comment #4 from Marc Glisse  ---
(In reply to SztfG from comment #3)
> Georg-Johann Lay, GCC does not always do better if you use a static inline
> function instead of a macro.

He did not claim it was always better...

> For example, this code:

Please don't mix unrelated issues; you can file a different bug if you want
that one addressed. In that case, it is because we have an old optimization in
fold_unary that only applies when everything is part of the same expression in
the source code (like other "do ... if ... simplifies" transformations, it is
not straightforward to move to match.pd).

  /* Convert ~(X ^ Y) to ~X ^ Y or X ^ ~Y if ~X or ~Y simplify.  */

If you use even fewer macros and more lines, things optimize a bit better:
m_xnor:
  TYPE tmp=!!a^!!b;
  return !tmp;
but we still have some discrepancy where we optimize (a^b)==0 but not ~(a^b)
(for _Bool type). This will be much more convenient to analyze in a different
bug report.
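
The identity quoted from fold_unary can be checked exhaustively for 8-bit values with a short C sketch. The function names are hypothetical, and the boolean helpers are only a source-level counterpart of the (a^b)==0 form mentioned above, written with ! because ~ applied to a promoted _Bool in C yields a nonzero int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Counts 8-bit (a, b) pairs where ~(a ^ b), ~a ^ b and a ^ ~b differ
   after truncation back to uint8_t.  */
static int count_xor_not_mismatches(void)
{
    int mismatches = 0;
    for (unsigned x = 0; x < 256; x++) {
        for (unsigned y = 0; y < 256; y++) {
            uint8_t a = (uint8_t)x;
            uint8_t b = (uint8_t)y;
            uint8_t lhs = (uint8_t)~(a ^ b);
            if (lhs != (uint8_t)(~a ^ b) || lhs != (uint8_t)(a ^ ~b))
                mismatches++;
        }
    }
    return mismatches;
}

/* Two source-level spellings of XNOR on booleans. */
static bool xnor_via_not(bool a, bool b) { return !(a ^ b); }
static bool xnor_via_eq(bool a, bool b) { return (a ^ b) == 0; }
```

Both boolean spellings agree on all four input combinations, so any difference in the generated code comes from which form the folder recognizes, not from the semantics.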

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2017-05-01 Thread SztfG at yandex dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

--- Comment #3 from SztfG at yandex dot ru ---
Georg-Johann Lay, GCC does not always do better if you use a static inline
function instead of a macro. For example, this code:

#include <stdint.h>

#define TYPE uint8_t

#define M_XOR(a,b) ((!!a)^(!!b))
#define M_NXOR(a,b) (!((!!a)^(!!b)))

__attribute__((__always_inline__, const))
static inline TYPE m_xor (const TYPE a, const TYPE b)
{
    return M_XOR(a,b);
}

__attribute__((__always_inline__, const))
static inline TYPE m_xnor (const TYPE a, const TYPE b)
{
    return M_NXOR(a,b);
}

// bad assembly output
int test1b(const TYPE a, const TYPE b)
{
    return m_xor(a,b) == !m_xnor(a,b);
}

int test2b(const TYPE a, const TYPE b)
{
    return !m_xor(a,b) == m_xnor(a,b);
}

int test3b(const TYPE a, const TYPE b)
{
    return M_XOR(a,b) == !m_xnor(a,b);
}

// good assembly output
int test1g(const TYPE a, const TYPE b)
{
    return m_xor(a,b) == M_XOR(a,b);
}

int test2g(const TYPE a, const TYPE b)
{
    return M_XOR(a,b) == !M_NXOR(a,b);
}

int test3g(const TYPE a, const TYPE b)
{
    return M_XOR(a,b) != !M_NXOR(a,b);
}

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2017-04-30 Thread gjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

Georg-Johann Lay  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gjl at gcc dot gnu.org

--- Comment #2 from Georg-Johann Lay  ---
GCC performs poorly on code expanded from macros that (recursively) duplicate
macro arguments.  Some time ago I dug into it, and the reason was that it
failed to recognize MIN_EXPR / MAX_EXPR because some optimizations factor out
common subexpressions.  Yet another problem is that the expressions inside the
conditions get promoted to int or unsigned, whereas the target values remain
types smaller than int.  (It was actually more complex code that implemented
saturation by nested MIN / MAX expressions.)

As a workaround, you can try to use inline functions so that GCC will
recognize MAX_EXPR and MIN_EXPR as expected.  The drawback is that you need a
series of macros for each input type like: int8_t, uint8_t, int16_t, ...
(unsigned and signed should be enough though).

Sample work around for unsigned char:

#define MAX_1(VAR, ...) \
  (VAR)

#define MAX_2(VAR, ...) \
  (((VAR)>MAX_1(__VA_ARGS__))?(VAR):MAX_1(__VA_ARGS__))

__attribute__((__always_inline__))
static inline unsigned char max2 (unsigned char a, unsigned char b)
{
    return MAX_2 (a, b);
}

#undef  MAX_2
#define MAX_2(a, b)  max2 (a, b)

#define MAX_3(VAR, ...) \
  (MAX_2 ((VAR), MAX_2(__VA_ARGS__)))

#define MAX_4(VAR, ...) \
  (MAX_2 ((VAR), MAX_3(__VA_ARGS__)))

#define MAX_5(VAR, ...) \
  (MAX_2 ((VAR), MAX_4(__VA_ARGS__)))

#define MAX_6(VAR, ...) \
  (MAX_2 ((VAR), MAX_5(__VA_ARGS__)))


The .original dump as generated with -fdump-tree-original reads now:


;; Function max2 (null)
{
  return MAX_EXPR ;
}

;; Function f1_unsigned (null)
{
  return max2 (a1, max2 (a2, max2 (a3, max2 (a1, max2 (a2, a3)))));
}

;; Function f2_unsigned (null)
{
  return max2 (a1, max2 (a2, a3));
}

and after inline expansion everything is nice with MAX_EXPR whereas your
original code leads to:


;; Function f1_unsigned (null)
{
  return (unsigned char) MAX_EXPR <MAX_EXPR <MAX_EXPR <MAX_EXPR <a2 > a3 ?
(int) a2 : (int) a3, (int) a1>, (int) a3>, (int) a2>, (int) a1>;
}

;; Function f2_unsigned (null)
{
  return (unsigned char) MAX_EXPR <a2 > a3 ? (int) a2 : (int) a3, (int) a1>;
}


Not all of the expressions are recognized as MAX_EXPR.
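
The argument duplication mentioned at the top of this comment can be made visible with a small C sketch. The counter and the counted() helper are hypothetical additions for illustration only; MAX_2 and max2 follow the definitions above, except that the always_inline attribute is omitted to keep the sketch portable.

```c
#include <assert.h>

/* Argument-duplicating macro, as in the original (pre-workaround) code. */
#define MAX_2(a, b) (((a) > (b)) ? (a) : (b))

/* Hypothetical instrumentation: counts how often an operand is evaluated. */
static unsigned evaluations;

static unsigned char counted(unsigned char v)
{
    evaluations++;
    return v;
}

/* Inline-function workaround: each argument is evaluated exactly once. */
static inline unsigned char max2(unsigned char a, unsigned char b)
{
    return MAX_2(a, b);
}

/* Nested macro use: the inner MAX_2 is pasted into the outer one twice. */
static unsigned evaluations_with_macro(void)
{
    evaluations = 0;
    unsigned char r = MAX_2(counted(1), MAX_2(counted(2), counted(3)));
    (void)r;
    return evaluations;
}

/* Nested function use: one evaluation per operand. */
static unsigned evaluations_with_inline(void)
{
    evaluations = 0;
    unsigned char r = max2(counted(1), max2(counted(2), counted(3)));
    (void)r;
    return evaluations;
}
```

With three operands the macro form evaluates operands seven times against three for the inline form, and the gap grows with every extra nesting level, which is the expanded expression the optimizers then have to untangle.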

[Bug tree-optimization/80574] GCC fail to optimize nested ternary

2017-04-30 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

--- Comment #1 from Marc Glisse  ---
With -fdump-tree-original, the signed case looks perfect:

  return MAX_EXPR <MAX_EXPR <MAX_EXPR <MAX_EXPR <MAX_EXPR <a2, a3>, a1>, a3>,
a2>, a1>;

(which reassoc eventually simplifies)
while in the unsigned case, we fail to recognize the innermost max:

  return (unsigned char) MAX_EXPR <MAX_EXPR <MAX_EXPR <MAX_EXPR <a2 > a3 ?
(int) a2 : (int) a3, (int) a1>, (int) a3>, (int) a2>, (int) a1>;

and we also fail during gimple, probably because of the conversions.
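
The conversions referred to here come from C's integer promotions: a ternary on two unsigned char operands has type int. A minimal C11 sketch, assuming a compiler with _Generic support; the macro TYPE_IS_INT and the function names are hypothetical.

```c
#include <assert.h>

/* Evaluates to 1 if the expression has type int, 0 otherwise (C11). */
#define TYPE_IS_INT(expr) _Generic((expr), int: 1, default: 0)

/* A ternary max on unsigned char operands is promoted to int. */
static int ternary_on_uchar_is_int(void)
{
    unsigned char a2 = 200, a3 = 100;
    return TYPE_IS_INT(a2 > a3 ? a2 : a3);
}

/* The operand on its own keeps its narrow type. */
static int plain_uchar_is_int(void)
{
    unsigned char a2 = 200;
    return TYPE_IS_INT(a2);
}

/* The value is unaffected; only the type widens. */
static int ternary_value(void)
{
    unsigned char a2 = 200, a3 = 100;
    return a2 > a3 ? a2 : a3;
}
```

The widened result is what forces the (int) casts seen in the dump, followed by the outer (unsigned char) conversion on return.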