On Wed, Sep 7, 2011 at 2:14 AM, Richard Sandiford
<richard.sandif...@linaro.org> wrote:
> Michael Hope <michael.h...@linaro.org> writes:
>> While out benchmarking today, I ran across code similar to this:
>>
>> int *a;
>> int *b;
>> int *c;
>>
>> const int ad[320];
>> const int bd[320];
>> const int cd[320];
>>
>> void fill()
>> {
>>   for (int i = 0; i < 320; i++)
>>     {
>>       a[i] = ad[i];
>>       b[i] = bd[i];
>>       c[i] = cd[i];
>>     }
>> }
>>
>> I was surprised and happy to see the vectoriser kick in for the copy.
>> The inner loop looks like:
>>
>>         add     r5, r3, ip
>>         adds    r4, r3, r7
>>         vldmia  r2!, {d16-d17}
>>         vldmia  r1!, {d18-d19}
>>         adds    r0, r3, r6
>>         vst1.32 {q9}, [r5]
>>         vst1.32 {q8}, [r4]
>>         vldmia  r3, {d16-d17}
>>         adds    r3, r3, #16
>>         cmp     r3, r8
>>         vst1.32 {q8}, [r0]
>>         bne     .L3
>>
>> so r3 is the loop variable and {ip,r7} are the offsets from r3 to the
>> destination pointers.  Adding a __restrict doesn't change the code.
>
> FWIW, this comes from ivopts.  I raised the "problem" on gcc@
> a few months back, but it seems to be intentional behaviour:
>
>     http://gcc.gnu.org/ml/gcc/2011-07/msg00050.html
>
> That is, all things being equal, the current code tends to prefer
> cases where it can hoist the difference between potential ivs
> rather than creating separate ivs.
>
> As far as the end of today's meeting goes: ivopts is one of those
> things on my unwritten list of areas that it would be nice to look at.
> I posted some benchmarks comparing -fivopts with -fno-ivopts to the
> benchmark list in July.  As expected, ivopts does help in a lot of cases,
> but there were also a fair number of cases where turning it off
> significantly improved performance.
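
To make the "hoist the difference" versus "separate ivs" distinction
concrete, here is a rough sketch of the two loop shapes written by hand
in C.  It is only an illustration, not GCC output or GCC internals: the
function names copy_hoisted_offset and copy_separate_ivs are made up for
this example, and the uintptr_t arithmetic assumes a flat address space
where pointer/integer round trips behave as on typical ARM and x86
targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shape 1: a single induction variable (the source pointer) plus a
   loop-invariant offset to the destination, computed once before the
   loop.  Each store address is rebuilt as "iv + offset", much like the
   "add r5, r3, ip" in the quoted assembly.
   Assumption: converting pointers to uintptr_t and back is well behaved
   on the target (implementation-defined in ISO C). */
void copy_hoisted_offset(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    /* Difference between the two candidate ivs, hoisted out of the loop.
       Unsigned arithmetic wraps, so this also works when dst < src. */
    uintptr_t delta = (uintptr_t)dst - (uintptr_t)src;

    for (const int *s = src; s != src + n; s++)
        *(int *)((uintptr_t)s + delta) = *s;
}

/* Shape 2: one induction variable per pointer.  There is no extra add
   to form the store address, but two variables are live and both are
   incremented on every iteration. */
void copy_separate_ivs(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    const int *s = src;
    int *d = dst;
    const int *end = src + n;

    while (s != end)
        *d++ = *s++;
}
```

Both functions copy the same data; they differ only in how the store
address is formed.  Shape 1 trades an extra add per store for fewer live
induction variables, while shape 2 allows post-increment addressing on
both the load and the store.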

Spawned into:
  https://blueprints.launchpad.net/gcc-linaro/+spec/investigate-ivopts

-- Michael