On 05/07/2018 12:15 PM, H.J. Lu wrote:
On Mon, May 7, 2018 at 7:09 AM, Luis Machado <luis.mach...@linaro.org> wrote:


On 05/01/2018 03:30 PM, Jeff Law wrote:

On 01/22/2018 06:46 AM, Luis Machado wrote:

This patch adds a new option to control the minimum stride of a memory
reference at or above which the loop prefetch pass may issue software
prefetch hints. There are two motivations:

* Make the pass less aggressive, only issuing prefetch hints for bigger
strides that are more likely to benefit from prefetching. I've noticed a
case in cpu2017 where we were issuing thousands of hints, for example.

* For processors that have a hardware prefetcher, like Falkor, it allows
the loop prefetch pass to defer prefetching of smaller (less than the
threshold) strides to the hardware prefetcher instead. This prevents
conflicts between the software prefetcher and the hardware prefetcher.

I've noticed a considerable reduction in the number of prefetch hints and
slightly positive performance numbers. This aligns GCC and LLVM in terms
of prefetch behavior for Falkor.

The default settings should guarantee no changes for existing targets.
Those targets are free to tweak the settings as necessary.

No regressions in the testsuite and bootstrapped ok on aarch64-linux.

Ok?

2018-01-22  Luis Machado  <luis.mach...@linaro.org>

         Introduce option to limit software prefetching to known constant
         strides above a specific threshold with the goal of preventing
         conflicts with a hardware prefetcher.

         gcc/
         * config/aarch64/aarch64-protos.h (cpu_prefetch_tune)
         <minimum_stride>: New const int field.
         * config/aarch64/aarch64.c (generic_prefetch_tune): Update to
         include minimum_stride field.
         (exynosm1_prefetch_tune): Likewise.
         (thunderxt88_prefetch_tune): Likewise.
         (thunderx_prefetch_tune): Likewise.
         (thunderx2t99_prefetch_tune): Likewise.
         (qdf24xx_prefetch_tune): Likewise. Set minimum_stride to 2048.
         (aarch64_override_options_internal): Update to set
         PARAM_PREFETCH_MINIMUM_STRIDE.
         * doc/invoke.texi (prefetch-minimum-stride): Document new option.
         * params.def (PARAM_PREFETCH_MINIMUM_STRIDE): New.
         * params.h (PARAM_PREFETCH_MINIMUM_STRIDE): Define.
         * tree-ssa-loop-prefetch.c (should_issue_prefetch_p): Return
         false if stride is constant and is below the minimum stride
         threshold.
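
As a rough illustration of the check the last ChangeLog entry describes,
here is a minimal standalone sketch. It is not the code from the patch:
MemRefSketch, step_magnitude and should_issue_prefetch_sketch are
hypothetical names, and treating a non-positive threshold as "filter
disabled" is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a memory reference; this is not GCC's mem_ref.
struct MemRefSketch
{
  bool step_is_constant;  // true when the step is known at compile time
  std::int64_t step;      // step in bytes, meaningful only if constant
};

// Magnitude of a signed step as an unsigned value.  Going through
// unsigned arithmetic keeps this well defined even for INT64_MIN.
inline std::uint64_t
step_magnitude (std::int64_t step)
{
  std::uint64_t u = static_cast<std::uint64_t> (step);
  return step < 0 ? 0 - u : u;
}

// Return false when the step is a known constant below the threshold.
// Assumption: a non-positive minimum_stride disables the filter.
inline bool
should_issue_prefetch_sketch (const MemRefSketch &ref,
                              std::int64_t minimum_stride)
{
  if (minimum_stride > 0
      && ref.step_is_constant
      && step_magnitude (ref.step)
         < static_cast<std::uint64_t> (minimum_stride))
    return false;
  return true;
}

// A few cases using the 2048 threshold the patch sets for qdf24xx.
inline void
sketch_self_check ()
{
  assert (!should_issue_prefetch_sketch ({true, 64}, 2048));    // small stride
  assert (!should_issue_prefetch_sketch ({true, -64}, 2048));   // negative step
  assert (should_issue_prefetch_sketch ({true, 4096}, 2048));   // large stride
  assert (should_issue_prefetch_sketch ({false, 0}, 2048));     // unknown step
  assert (should_issue_prefetch_sketch ({true, 64}, -1));       // filter off
}
```

Only constant strides are filtered, so references with an unknown step are
unaffected by the threshold in this sketch.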

OK for the trunk.
jeff


Thanks. Committed as revision 259995 now.

This breaks bootstrap on x86:

../../src-trunk/gcc/tree-ssa-loop-prefetch.c: In function ‘bool should_issue_prefetch_p(mem_ref*)’:
../../src-trunk/gcc/tree-ssa-loop-prefetch.c:1010:54: error: comparison of integer expressions of different signedness: ‘long long unsigned int’ and ‘int’ [-Werror=sign-compare]
        && absu_hwi (int_cst_value (ref->group->step)) < PREFETCH_MINIMUM_STRIDE)
../../src-trunk/gcc/tree-ssa-loop-prefetch.c:1014:4: error: format ‘%d’ expects argument of type ‘int’, but argument 5 has type ‘long long int’ [-Werror=format=]
     "Step for reference %u:%u (%d) is less than the mininum "
     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     "required stride of %d\n",
     ~~~~~~~~~~~~~~~~~~~~~~~~~
     ref->group->uid, ref->uid, int_cst_value (ref->group->step),
                                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


I've reverted this for now while I address the bootstrap problem.
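
For context, the two diagnostics above come from mixing an unsigned
64-bit value with an int in a comparison, and from printing a 64-bit
value with %d. The standalone sketch below shows one way to avoid both
in plain C++; it is not necessarily how the patch was fixed, and
step_below_minimum and describe_step are hypothetical names used only
for illustration.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Compare in a single signed 64-bit type so that both operands have the
// same signedness and -Wsign-compare has nothing to report.
inline bool
step_below_minimum (std::int64_t step, int minimum_stride)
{
  // INT64_MIN has no signed magnitude; it exceeds any int threshold.
  if (step == INT64_MIN)
    return false;
  std::int64_t magnitude = step < 0 ? -step : step;
  return magnitude < static_cast<std::int64_t> (minimum_stride);
}

// Format the dump message with a conversion that matches the 64-bit
// argument (PRId64) instead of %d.
inline std::string
describe_step (unsigned group_uid, unsigned ref_uid,
               std::int64_t step, int minimum_stride)
{
  char buf[160];
  int written = std::snprintf (buf, sizeof buf,
                               "Step for reference %u:%u (%" PRId64
                               ") is less than the minimum "
                               "required stride of %d\n",
                               group_uid, ref_uid, step, minimum_stride);
  assert (written > 0 && static_cast<std::size_t> (written) < sizeof buf);
  return std::string (buf);
}
```

Within GCC itself the step is a HOST_WIDE_INT, whose width depends on the
host, which is why a plain %d works on some hosts and fails on others.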
