Re: [RFC][AARCH64][PATCH 5/5] add aarch64_loop_unroll_adjust to limit partial unrolling in rtl based on strided-loads in loop

Kugan Vivekanandarajah Sat, 16 Sep 2017 15:54:59 -0700

Hi Ramana

On 15 September 2017 at 18:40, Ramana Radhakrishnan
<ramana....@googlemail.com> wrote:
> On Fri, Sep 15, 2017 at 2:33 AM, Kugan Vivekanandarajah
> <kugan.vivekanandara...@linaro.org> wrote:
>> This patch adds aarch64_loop_unroll_adjust to limit partial unrolling
>> in rtl based on strided-loads in loop.
>>
>> Thanks,
>> Kugan
>>
>> gcc/ChangeLog:
>>
>> 2017-09-12  Kugan Vivekanandarajah  <kug...@linaro.org>
>>
>>     * cfgloop.h (iv_analyze_biv): Export.
>>     * loop-iv.c: Likewise.
>>     * config/aarch64/aarch64.c (strided_load_p): New.
>>     (insn_has_strided_load): New.
>>     (count_strided_load_rtl): New.
>>     (aarch64_loop_unroll_adjust): New.
>
>
> This implementation assumes a particular kind of prefetcher and
> collisions in that hardware prefetcher. Are you sure this helps (or
> at least doesn't harm) every single micro-architecture out there?
> Furthermore, how has this patchset been benchmarked: on which
> micro-architecture, with which benchmarks, and with what performance
> impact? And why should this be considered for generic?
>


I tested with -mcpu=falkor. At the moment the patch has no effect on
other CPUs, and it is not enabled for generic.
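
To illustrate the kind of gating described above, here is a minimal
standalone sketch in C. It is not the code from the patch: the function
name, the tuned_cpu flag, and the MAX_PREFETCH_STREAMS value are all
hypothetical, and the heuristic (dividing an assumed stream budget by the
number of strided loads) is only one plausible way to limit the unroll
factor. It mirrors the shape of a loop_unroll_adjust-style hook, which
receives a proposed unroll factor and returns a possibly smaller one.

```c
/* Hypothetical number of streams a hardware prefetcher can track at once.
   This value is an assumption for illustration, not taken from the patch.  */
#define MAX_PREFETCH_STREAMS 8u

/* Hypothetical stand-in for a loop_unroll_adjust-style hook.
   nunroll:          unroll factor proposed by the unroller
   n_strided_loads:  strided loads counted in the loop body
   tuned_cpu:        nonzero only when tuning for a CPU that opts in
   Returns the unroll factor to use, never larger than nunroll.  */
static unsigned
unroll_adjust_sketch (unsigned nunroll, unsigned n_strided_loads,
                      int tuned_cpu)
{
  /* Other CPUs (including generic) are left untouched.  */
  if (!tuned_cpu || n_strided_loads == 0)
    return nunroll;

  /* Each unrolled copy adds n_strided_loads streams; keep the total
     within the assumed prefetcher budget, but never go below 1.  */
  unsigned limit = MAX_PREFETCH_STREAMS / n_strided_loads;
  if (limit < 1)
    limit = 1;

  return nunroll < limit ? nunroll : limit;
}
```

With these assumed numbers, a loop with 4 strided loads and a proposed
factor of 8 would be capped at 2 on an opted-in CPU, and left at 8
everywhere else.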

Thanks,
Kugan
