[Bug c++/86619] Missed optimization opportunity with array aliasing

2021-12-02 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86619

--- Comment #5 from Andrew Pinski  ---
I Noticed clang, ICC nor MSVC either handle this either.

Note GCC is the only one which handles :

int f(std::array & a, std::array & b)
{
  a[0] = 1;
  b[0] = 1;
  return a[0];
}

[Bug c++/86619] Missed optimization opportunity with array aliasing

2018-07-23 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86619

--- Comment #4 from Michael Veksler  ---
It is interesting to check the impact on numerical C++ benchmarks.

Fortran has a conceptual restrict on all its parameter arrays, 
since aliasing is not allowed.

void f(int * __restrict__ v1, int * __restrict__ v2, int n)
{
for (int i=0 ; i < n ; i++)
v1[0] += v2[i];
}
and Fortran:
  subroutine f(v1, v2, n)
  integer :: v1(100)
  integer :: v2(100)
  integer :: n

  DO i=1, n
v1(1) = v1(1) + v2(i)
  END DO
  end subroutine f

Generate the same loop:
.L3:
addl(%rdx), %eax
addq$4, %rdx
cmpq%rdx, %r8
jne .L3


But without restrict, as expected, g++ generates:
.L8:
addl(%rdx), %eax
addq$4, %rdx
cmpq%r8, %rdx
movl%eax, (%rcx)
jne .L8

Running both variants from a loop (in a separate translation unit, 
without whole program optimization) (g++ 7.2.0 with -O2 on 64 bit cygwin):
#include 
#include 
void f(int * __restrict__ v1, int *__restrict__  v2, int SIZE);
void g(int * v1, int * v2, int SIZE);
constexpr int SIZE = 1'000'000;
int v2[SIZE];
int main()
{
int v1;

f(, v2, SIZE); // Warm up cache

auto start = std::clock();
constexpr int TIMES = 10'000;
for (int i=0 ; i < TIMES; ++i) {
v1 = 0;
f(, v2, SIZE);
}

auto t1 = std::clock();
for (int i=0 ; i < TIMES; ++i) {
v1 = 0;
g(, v2, SIZE);
}

auto t2 = std::clock();
std::cout << "with restrict: "
   << double(t1 - start) / CLOCKS_PER_SEC << " sec\n";
std::cout << "without restrict: "
   << double(t2 - t1) / CLOCKS_PER_SEC  << " sec\n";
}

And the results are:
  with restrict: 4.477 sec
  without restrict: 5.756 sec
Which clearly demonstrates the impact of good alias analysis.

With plain C pointers, this is an unavoidable price.
But unfortunately this also happens when passing pointers or 
references to arrays of different sizes, or when inheriting two 
different types from std::array, in order to mark the parameters
as non-aliasing.

[Bug c++/86619] Missed optimization opportunity with array aliasing

2018-07-23 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86619

--- Comment #3 from rguenther at suse dot de  ---
On Mon, 23 Jul 2018, mickey.veksler at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86619
> 
> --- Comment #2 from Michael Veksler  ---
> >> type-based alias analysis doesn't distinguish between int[2] and int[3]. 
> 
> Is it just the way GCC implements type-based alias analysis, 
> or is it defined that way in the C and C++ standards?

It's the way GCC implements it.

> I suspect  that the weaker alias analysis of arrays (int [size] and
> std::array) is one of the things that make C++ slower than 
> Fortran on some benchmarks.

Not sure - Fortran shares the restriction and also uses pointer-based
accesses.  Fortran is just more constrained so it can put __restrict
on its arrays as an implementation detail very aggressively.

[Bug c++/86619] Missed optimization opportunity with array aliasing

2018-07-23 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86619

--- Comment #2 from Michael Veksler  ---
>> type-based alias analysis doesn't distinguish between int[2] and int[3]. 

Is it just the way GCC implements type-based alias analysis, 
or is it defined that way in the C and C++ standards?

I suspect  that the weaker alias analysis of arrays (int [size] and
std::array) is one of the things that make C++ slower than 
Fortran on some benchmarks.

[Bug c++/86619] Missed optimization opportunity with array aliasing

2018-07-23 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86619

Richard Biener  changed:

   What|Removed |Added

   Keywords||alias, missed-optimization
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-07-23
 CC||rguenth at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
type-based alias analysis doesn't distinguish between int[2] and int[3].  The
issue with operator[] is that the FE produces


;; Function T& ar::operator[](size_t) [with T = int; long unsigned int
size = 2; size_t = long unsigned int] (null)
;; enabled by -tree-original


return  = (int &) &((struct ar *) this)->ar[offset];

and

<::operator[] ((struct ar *) a, 0) = 1) >;
<::operator[] ((struct ar *) b, 0) = 2) >;
< = *ar::operator[] ((struct ar *) a,
0)>>;

which after inlining is

  int & _6;
  int & _9;

   :
  _6 = _5(D)->ar[0];
  *_6 = 1;
  _9 = _8(D)->ar[0];
  *_9 = 2;
  _11 = _5(D)->ar[0];
  _12 = *_11;
  return _12;

compared to

f1 (struct ar & a, struct ar & b)
{
  int _6;

   :
  a_2(D)->ar[0] = 1;
  b_4(D)->ar[0] = 2;
  _6 = a_2(D)->ar[0];
  return _6;

here TBAA only sees int & accesses which do conflict and points-to analysis
is TBAA agnostic and cannot disambiguate a_5(D) and b_8(D).  For f1
TBAA sees structure accesses and can disambiguate.

C++ abstraction makes it harder to optimize here.  You get two accesses
of effective type int vs. one of ar and one of ar.


Way in the past points-to had some bits of TBAA, eventually we can
re-introduce bits here but the TBAA bits did not play well with
the points-to solver and created wrong-code.


Note there isn't really a way to tell the middle-end that a pointed
to object is of a specific dynamic type.  Eventually we can play
leeway and make REFERENCE_TYPE parameters behave that way.