https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79742

            Bug ID: 79742
           Summary: ARM sched pipeline selection problems
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wilson at gcc dot gnu.org
  Target Milestone: ---

I noticed an arm port scheduler problem.  I believe it was introduced by
Richard Earnshaw's patch to add the arm-cpus.in file.  There are also some
apparent latent problems exposed by that patch.

The main problem is that the "tune for" lines for cpu entries are not working.
In the parsecpu.awk script, cpu_tune_for is set but never used.  This means
that cpu entries that use "tune for" lines are not using the desired scheduler
pipeline.

With the testcase
int sub1 (int i, int j) { return i + j; }
float sub2 (float i, float j) { return i + j; }

Compiling with
./xgcc -B./ -O2 -mcpu=cortex-a12 -fsched-verbose=9 -fdump-rtl-sched1-all -S tmp.c
and looking at the sched1 dump file, I see for the int add
;;      |   12 |    0 | r0=r0+r1                       nothing
and for the float add I see
;;      |   12 |    0 | s0=s0+s1                       fmac
So this is using no pipeline for integer code, and the vfp11.md pipeline for
float code.  This should be using the cortex-a17 pipeline.  If I compile instead
with -mcpu=cortex-a72, I see for the int add
;;      |   12 |    0 | r0=r0+r1                       core
and for the float add I see
;;      |   12 |    0 | s0=s0+s1                       core*32
So this is using the arm-generic pipeline for both the int and float add.  This
should be using the cortex-a57 pipeline.  Both of these work correctly in GCC
6.
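
To compare several -mcpu values, the reservation column can be pulled out of
the sched1 dump mechanically.  The following is only a minimal sketch in
Python.  It assumes the insn lines have the layout shown above: a leading ";;",
three "|" separators, and the reservation name as the last
whitespace-delimited token.  The function name reservation_of is hypothetical
and not part of GCC.

```python
def reservation_of(line):
    """Return (pattern, reservation) for a sched dump insn line, else None.

    Assumes lines shaped like the ones quoted in this report, e.g.
    ";;      |   12 |    0 | r0=r0+r1                       nothing"
    """
    if not line.startswith(";;"):
        return None
    # Split off the two numeric columns; keep the rest of the line intact.
    fields = line.split("|", 3)
    if len(fields) != 4:
        return None
    # Assumption: the reservation is the last whitespace-delimited token.
    parts = fields[3].rsplit(None, 1)
    if len(parts) != 2:
        return None
    pattern, reservation = parts
    return pattern.strip(), reservation


if __name__ == "__main__":
    samples = [
        ";;      |   12 |    0 | r0=r0+r1                       nothing",
        ";;      |   12 |    0 | s0=s0+s1                       fmac",
        ";;      |   12 |    0 | s0=s0+s1                       core*32",
    ]
    for s in samples:
        print(reservation_of(s))
```

Run on the four lines quoted above, this reports "nothing" and "fmac" for
cortex-a12, and "core" and "core*32" for cortex-a72.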

I see two apparent latent issues here.  The generic_sched and generic_vfp
attributes in the arm.md file haven't been updated properly and some targets
may be using them accidentally.  All targets with their own pipeline should be
added to the exclusion lists for these attributes.

Also, there is a conflict between the generic_sched and generic_vfp pipelines.
arm-generic.md has a rule to match multi-cycle instructions, which also happens
to match FP instructions.  So it is not possible to use both generic_sched and
generic_vfp, as FP instructions will use the multi-cycle rule in arm-generic.md
instead of the appropriate rule in vfp11.md.  This can be seen in the above
results, where compiling for cortex-a12 uses the fmac rule from the vfp11.md
file for a float add because it is in the exclusion list for generic_sched. 
But compiling for cortex-a72 uses the core*32 rule from the arm-generic.md file
for a float add because it is not in the exclusion list for generic_sched.  Of
course cortex-a12 and cortex-a72 should not be using the generic rules, but it
does appear that there are some targets that are using both generic_sched and
generic_vfp, such as the armv6 parts like mpcore.
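
The interaction described above can be summarized with a toy model.  This is
only a sketch in Python that reproduces the observed results.  It is not how
genautomata or the .md files are implemented, and the function name,
argument names, and insn labels are all hypothetical.  It assumes, as the
results above suggest, that the arm-generic.md multi-cycle rule wins over the
vfp11.md rule whenever generic_sched is enabled.

```python
def toy_reservation(insn, generic_sched, generic_vfp):
    """Toy model of which reservation an insn ends up with.

    insn is "int_add" or "float_add" (hypothetical labels).
    generic_sched / generic_vfp say whether the cpu uses the
    arm-generic.md and vfp11.md pipelines respectively.
    """
    if generic_sched:
        # arm-generic.md: the int add gets "core"; the multi-cycle rule
        # also catches FP instructions, so the float add gets "core*32".
        return "core" if insn == "int_add" else "core*32"
    if generic_vfp and insn == "float_add":
        # Only reached when generic_sched is off: vfp11.md rule applies.
        return "fmac"
    # No pipeline description matched.
    return "nothing"


if __name__ == "__main__":
    # cortex-a12 as observed: excluded from generic_sched, not from generic_vfp.
    print(toy_reservation("int_add", False, True))    # nothing
    print(toy_reservation("float_add", False, True))  # fmac
    # cortex-a72 as observed: not excluded from generic_sched.
    print(toy_reservation("int_add", True, True))     # core
    print(toy_reservation("float_add", True, True))   # core*32
```

In this model a target with both attributes enabled never reaches the fmac
rule, which is the conflict described above.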
