http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55213
Bug #: 55213
Summary: vectorizer ignores __restrict__
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: tree-optimization
AssignedTo: [email protected]
ReportedBy: [email protected]
I have raised this issue before, but I still think that, as vectorization
becomes more and more common, aliasing is becoming a problem for both code
size and speed.
For all the loops below, the compiler emits runtime alias checks.
Ideally, foo would produce optimal code (possibly with far fewer
__restrict__ qualifiers than I used below);
however, even in the other functions __restrict__ is ignored.
Compiled as
c++ -Ofast -c soa.cc -std=gnu++11 -ftree-vectorizer-verbose=1 -Wall -march=corei7
with gcc version 4.8.0 20121028 (experimental) [trunk revision 192889] (GCC)
#include <cstddef>   // std::size_t
#include <cstdint>   // uint32_t
struct Soa {
    uint32_t * mem;
    uint32_t ns;
    uint32_t cp;
    int const * __restrict__ i() const __restrict__ { return (int const* __restrict__)(mem);}
    float const * __restrict__ f() const __restrict__ { return (float const* __restrict__)(mem+cp);}
    float const * __restrict__ g() const __restrict__ { return (float const* __restrict__)(mem+2*cp);}
};
void foo(Soa const & __restrict__ soa, float * __restrict__ res) {
    for(std::size_t i=0; i!=soa.ns; ++i)
        res[i] = soa.f()[i]+soa.g()[i];
}
void bar(Soa const & __restrict__ soa, float * __restrict__ res) {
    float const * __restrict__ f = soa.f();
    float const * __restrict__ g = soa.g();
    int n = soa.ns;
    for(int i=0; i!=n; ++i)
        res[i] = f[i]+g[i];
}
inline
void add(float const * __restrict__ f, float const * __restrict__ g, float * __restrict__ res, int n) {
    for(int i=0; i!=n; ++i)
        res[i] = f[i]+g[i];
}
void add(Soa const & __restrict__ soa, float * __restrict__ res) {
    add(soa.f(),soa.g(),res,soa.ns);
}
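
For readers unfamiliar with the alias checks mentioned in this report, the
following is a hand-written sketch of what a versioned loop amounts to. It is
not actual GCC output; the names disjoint and add_versioned are hypothetical,
and the four-wide block only stands in for a real vector instruction. The
runtime overlap test and the duplicated loop body are the code-size and speed
cost that an honored __restrict__ would make unnecessary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if [a, a+n) and [b, b+n) share no bytes.
static bool disjoint(const float* a, const float* b, int n) {
    auto pa = reinterpret_cast<std::uintptr_t>(a);
    auto pb = reinterpret_cast<std::uintptr_t>(b);
    auto bytes = static_cast<std::uintptr_t>(n) * sizeof(float);
    return pa + bytes <= pb || pb + bytes <= pa;
}

// Hand-written analogue of a loop versioned for aliasing (an assumption
// about the general shape, not a dump of what the vectorizer emits).
void add_versioned(const float* f, const float* g, float* res, int n) {
    assert(n >= 0);
    if (disjoint(res, f, n) && disjoint(res, g, n)) {
        // Fast path: stores to res cannot modify f or g, so four
        // iterations can be loaded, added, and stored as one block.
        int i = 0;
        for (; i + 4 <= n; i += 4) {
            float t0 = f[i]     + g[i];
            float t1 = f[i + 1] + g[i + 1];
            float t2 = f[i + 2] + g[i + 2];
            float t3 = f[i + 3] + g[i + 3];
            res[i]     = t0;
            res[i + 1] = t1;
            res[i + 2] = t2;
            res[i + 3] = t3;
        }
        // Remainder iterations that do not fill a block of four.
        for (; i < n; ++i)
            res[i] = f[i] + g[i];
    } else {
        // Fallback: strict element-by-element order preserves the
        // semantics when res overlaps an input.
        for (int i = 0; i < n; ++i)
            res[i] = f[i] + g[i];
    }
}
```

With all pointers declared __restrict__, as in bar and add above, only the
fast path should be needed, since the overlap test can never fail in a
program that respects the qualifier.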