Hi Ramana

On 15 September 2017 at 18:40, Ramana Radhakrishnan <ramana....@googlemail.com> wrote:
> On Fri, Sep 15, 2017 at 2:33 AM, Kugan Vivekanandarajah
> <kugan.vivekanandara...@linaro.org> wrote:
>> This patch adds aarch64_loop_unroll_adjust to limit partial unrolling
>> in rtl based on strided-loads in loop.
>>
>> Thanks,
>> Kugan
>>
>> gcc/ChangeLog:
>>
>> 2017-09-12 Kugan Vivekanandarajah <kug...@linaro.org>
>>
>> * cfgloop.h (iv_analyze_biv): export.
>> * loop-iv.c: Likewise.
>> * config/aarch64/aarch64.c (strided_load_p): New.
>> (insn_has_strided_load): New.
>> (count_strided_load_rtl): New.
>> (aarch64_loop_unroll_adjust): New.
>
>
> This implementation assumes a particular kind of prefetcher and
> collisions in that hardware prefetcher. Are you sure this helps every
> single micro-architecture out there (or rather doesn't harm ?) ?
> Further more how has this patchset been benchmarked, what
> micro-architecture, what benchmarks, what's the performance impact and
> why should this be considered for generic ?
>

I tested with -mcpu=falkor, and at the moment the patch does not have
any effect on other CPUs. It is not enabled for generic.

Thanks,
Kugan