https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81828

            Bug ID: 81828
           Summary: Cilkplus performance regression on ARM...
           Product: gcc
           Version: 7.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ejolson at unr dot edu
  Target Milestone: ---

Created attachment 41979
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41979&action=edit
Graph showing performance regression...

Code compiled with gcc version 7.1 using the Cilkplus parallel programming
extensions on ARM runs much slower than the same code compiled with version
6.2.  Details may be viewed graphically at

    http://fractal.math.unr.edu/~ejolson/bench/dotprod/gcc71-8.png

which consistently shows a loss of performance when using any number of cores
from 1 to 8 on a Samsung/Nexell S5P6818 based SBC.  More information and
example code are available at

    https://www.raspberrypi.org/forums/viewtopic.php?p=711196#p1197225

My impression is that this regression affects almost all Cilkplus code on ARM
and is possibly the result of an unaligned cactus stack or additional overhead
in switching tasks that was not present in the 6.2 version.  It is likely that
performance-based tests for ARM Cilkplus are needed to ensure such regressions
do not happen in the future.  Note that the performance of serial code is not
affected.

The test code was compiled for 32-bit mode using options

    -fcilkplus -O3 -mcpu=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard -ffast-math

and run under identical circumstances in both cases.
