Re: [PATCH 00/30] [ARM] Reworking the -mcpu, -march and -mfpu options

Christophe Lyon Tue, 13 Jun 2017 09:09:06 -0700

On 13 June 2017 at 17:25, Richard Earnshaw (lists)
<richard.earns...@arm.com> wrote:
> On 12/06/17 15:34, Richard Earnshaw (lists) wrote:
>> On 12/06/17 12:49, Christophe Lyon wrote:
>>> On 10 June 2017 at 01:27, Richard Earnshaw (lists)
>>> <richard.earns...@arm.com> wrote:
>>>> On 09/06/17 23:45, Christophe Lyon wrote:
>>>>> Hi Richard,
>>>>>
>>>>>
>>>>> On 9 June 2017 at 14:53, Richard Earnshaw <richard.earns...@arm.com> 
>>>>> wrote:
>>>>>>
>>>>>> During the ARM BoF at the Cauldron last year I mentioned that I wanted
>>>>>> to rework the way GCC on ARM handles the command line options.  The
>>>>>> problem was that most users, and even many experts, can't remember
>>>>>> which FPU/SIMD unit comes with which CPU and that consequently many
>>>>>> users were inadvertenly generating sub-optimal code for their system.
>>>>>>
>>>>>> This patch series implements the proposed change and provides support
>>>>>> for a generic way of adding optional features to architectures and CPU
>>>>>> names.  The documentation patches at the end of the series explain the
>>>>>> new syntax, so I won't repeat all that here.  Suffice to say here that
>>>>>> the result is that the -mfpu option now defaults to 'auto', which
>>>>>> allows the compiler to infer the floating-point and simd options from
>>>>>> the CPU/architecture options and that these options can normally be
>>>>>> expressed in a context-specific manner like +simd or +fp without
>>>>>> having to know precisely which variant is implemented.  Long term I'd
>>>>>> like to deprecate -mfpu and entirely move over to the new syntax; but
>>>>>> it's too early to start that process now.
>>>>>>
>>>>>> All the patches in the series should build a working basic compiler,
>>>>>> but the multilib selection will not work correctly until the relevant
>>>>>> patches towards the end are applied.  It is not really feasible to
>>>>>> retain that functionality without collapsing too many of the patches
>>>>>> together into one hunk.  It's also possible that some tests in the
>>>>>> testsuite may exhibit transient misbehaviour, but there should be no
>>>>>> regressions by the end of the sequence (some tests no-longer run in
>>>>>> the default configurations because the default CPU does not have
>>>>>> floating-point support).
>>>>>>
>>>>>> Just two patches are to the generic code, but both are fairly trivial.
>>>>>> One permits the sbitmap code to be used in the driver programs and the
>>>>>> other provides a way of escaping the meta-character in some multilib
>>>>>> reuse strings.
>>>>>>
>>>>>> I won't apply any of this series until those two patches have been
>>>>>> approved, and I won't commit anything before the middle of next week
>>>>>> even then.  This is a fairly complex change and it deserves some time
>>>>>> for people to comment before committing.
>>>>>>
>>>>>> R.
>>>>>>
>>>>>> Richard Earnshaw (30):
>>>>>>   [arm] Use strings for -march, -mcpu and -mtune options
>>>>>>   [arm] Rewrite -march and -mcpu options for passing to the assembler
>>>>>>   [arm] Don't pass -mfpu=auto through to the assembler.
>>>>>>   [arm] Allow +opt on arbitrary cpu and architecture specifications
>>>>>>   [arm] Add architectural options
>>>>>>   [arm] Add default FPUs for CPUs.
>>>>>>   [build] Make sbitmap code available to the driver programs
>>>>>>   [arm] Split CPU, architecture and tuning data tables.
>>>>>>   [ARM] Move cpu and architecture option name parsing code to
>>>>>>     arm-common.c
>>>>>>   [arm] Use standard option parsing code for detecting thumb-only
>>>>>>     targets
>>>>>>   [arm] Allow CPU and architecture extensions to be defined as aliases
>>>>>>   [arm] Allow new extended syntax CPU and architecture names during
>>>>>>     configure
>>>>>>   [arm] Force a CPU default in the config args defaults list.
>>>>>>   [arm] Generate a canonical form for -march
>>>>>>   [arm] Make -mfloat-abi=softfp work when there are no FPU instructions
>>>>>>   [arm] Update basic multilib configuration
>>>>>>   [arm] Make 'auto' the default FPU selection option.
>>>>>>   [arm] Rewrite t-aprofile using new selector methodology
>>>>>>   [arm] Explicitly set .fpu in cmse_nonsecure_call.S
>>>>>>   [genmultilib] Allow explicit periods to be escaped in MULTILIB_REUSE
>>>>>>   [arm][testsuite] Use -march=armv7-a+fp when testing hard-float ABI.
>>>>>>   [arm] Rewrite t-rmprofile multilib specification
>>>>>>   [arm][rtems] Update t-rtems for new option framework
>>>>>>   [arm][linux-eabi] Ensure all multilib variables are reset
>>>>>>   [arm][phoenix] reset all multilib variables
>>>>>>   [arm] Rework multlib builds for symbianelf
>>>>>>   [arm][fuchsia] Rework multilib support
>>>>>>   [arm] Add a few missing architecture extension options.
>>>>>>   [arm][doc] Document new -march= syntax.
>>>>>>   [arm][doc] Document changes to -mcpu, -mtune and -mfpu.
>>>>>>
>>>>>>  gcc/Makefile.in                           |    2 +-
>>>>>>  gcc/common/config/arm/arm-common.c        |  651 +++++++-
>>>>>>  gcc/config.gcc                            |   17 +-
>>>>>>  gcc/config/arm/arm-builtins.c             |    4 +-
>>>>>>  gcc/config/arm/arm-cpu-cdata.h            | 2444 
>>>>>> +++++++++++++++++++++++------
>>>>>>  gcc/config/arm/arm-cpu-data.h             | 1410 ++---------------
>>>>>>  gcc/config/arm/arm-cpu.h                  |   38 +
>>>>>>  gcc/config/arm/arm-cpus.in                |  237 ++-
>>>>>>  gcc/config/arm/arm-isa.h                  |   20 +-
>>>>>>  gcc/config/arm/arm-protos.h               |   56 +-
>>>>>>  gcc/config/arm/arm-tables.opt             |   21 +-
>>>>>>  gcc/config/arm/arm.c                      |  337 ++--
>>>>>>  gcc/config/arm/arm.h                      |   75 +-
>>>>>>  gcc/config/arm/arm.opt                    |   15 +-
>>>>>>  gcc/config/arm/bpabi.h                    |    4 -
>>>>>>  gcc/config/arm/elf.h                      |    6 +-
>>>>>>  gcc/config/arm/linux-elf.h                |    3 -
>>>>>>  gcc/config/arm/netbsd-elf.h               |    4 -
>>>>>>  gcc/config/arm/parsecpu.awk               |  295 +++-
>>>>>>  gcc/config/arm/t-aprofile                 |  200 +--
>>>>>>  gcc/config/arm/t-arm-elf                  |  173 +-
>>>>>>  gcc/config/arm/t-fuchsia                  |   33 +
>>>>>>  gcc/config/arm/t-linux-eabi               |    4 +
>>>>>>  gcc/config/arm/t-multilib                 |  126 +-
>>>>>>  gcc/config/arm/t-phoenix                  |   20 +-
>>>>>>  gcc/config/arm/t-rmprofile                |  147 +-
>>>>>>  gcc/config/arm/t-rtems                    |   49 +-
>>>>>>  gcc/config/arm/t-symbian                  |   34 +-
>>>>>>  gcc/config/arm/vxworks.h                  |    2 -
>>>>>>  gcc/doc/fragments.texi                    |   10 +-
>>>>>>  gcc/doc/invoke.texi                       |  371 ++++-
>>>>>>  gcc/genmultilib                           |    4 +-
>>>>>>  gcc/testsuite/gcc.dg/pr59418.c            |    2 +-
>>>>>>  gcc/testsuite/gcc.target/arm/multilib.exp |  685 ++++++++
>>>>>>  gcc/testsuite/gcc.target/arm/pr51915.c    |    2 +-
>>>>>>  gcc/testsuite/gcc.target/arm/pr52006.c    |    2 +-
>>>>>>  gcc/testsuite/gcc.target/arm/pr53187.c    |    2 +-
>>>>>>  libgcc/config/arm/cmse_nonsecure_call.S   |    8 +
>>>>>>  38 files changed, 5073 insertions(+), 2440 deletions(-)
>>>>>>  create mode 100644 gcc/config/arm/t-fuchsia
>>>>>>  create mode 100644 gcc/testsuite/gcc.target/arm/multilib.exp
>>>>>>
>>>>>> ----------------2.7.4--
>>>>>>
>>>>>
>>>>> I wanted to run a validation with the full series applied as one patch
>>>>> over r249050,
>>>>> so I just downloaded all the patches, concatenated them in order, but the 
>>>>> result
>>>>> fails to apply. (conflicts with arm-cpu-cdata.h, arm-cpus.in,
>>>>> t-aprofile, t-rmprofile)
>>>>>
>>>>> Am I missing something?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Christophe
>>>>>
>>>>
>>>> I really appreciate you trying to do this...
>>>>
>>>> I rebased the patch series sometime on Tuesday/Wednesday just before Jim
>>>> Wilson committed his falkor change; you'll need to back out to just
>>>> before r248944.  I can't think of anything else that might have just
>>>> gone in that might conflict.  For arm-cpu-cdata.h (and the other
>>>> auto-generated files, like arm-cpu-data.h) you can just delete the file
>>>> and it will be rebuilt automatically, the others really do need the
>>>> conflict to be resolved.
>>>>
>>>> Obviously I'll rebase again just before the commit, but propagating such
>>>> changes through the patch series is quite tedious and I don't want to do
>>>> it any more than necessary :-).
>>>>
>>>> It would be easier if the generated files weren't checked in to the
>>>> repository...
>>>>
>>>> R.
>>>
>>> Hi Richard,
>>>
>>> Starting a validation of the whole patch against r248942 did work, thanks.
>>>
>>> The results are here:
>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/248942-rework-cpu-arch-fpu.patch/report-build-info.html
>>>
>>> Regressions are detected in many configurations, unfortunately.
>>> Some may be caused by the fact that I've upgraded to dejagnu-1.6+ for
>>> my testing,
>>> and thus "multilib flags" are now prepended rather than appended, but
>>> I don't think that's the majority.
>>>
>>> To help you understand/group all the reports:
>>> - all "arm-none-linux-gnueabi" with REGRESSED now have:
>>> FAIL: gcc.target/arm/neon-thumb2-move.c (test for excess errors)
>>
>> See later.
>>
>>> FAIL: gcc.target/arm/no-volatile-in-it.c scan-assembler-not ldrgt
>>> - same for the 2 configs arm-none-eabi --with-cpu=cortex-a9
>>
>> I'm checking a fix for this, I think I understand what's going on here.
>>
>>>
>>> - all "arm-none-linux-gnueabihf" and "armeb-none-linux-gnueabihf" with
>>> REGRESSED now have:
>>> FAIL: gcc.target/arm/no-volatile-in-it.c scan-assembler-not ldrgt
>>>
>>> - arm-none-eabi --with-cpu=cortex-m3 now has:
>>> FAIL:
>>>     gcc.target/arm/no-volatile-in-it.c scan-assembler-not ldrgt
>>>     gcc.target/arm/thumb2-slow-flash-data-2.c (test for excess errors)
>>>     gcc.target/arm/thumb2-slow-flash-data-3.c (test for excess errors)
>>>     gcc.target/arm/thumb2-slow-flash-data-4.c (test for excess errors)
>>>     gcc.target/arm/thumb2-slow-flash-data-5.c (test for excess errors)
>>> UNRESOLVED:
>>>     gcc.target/arm/thumb2-slow-flash-data-4.c scan-assembler-times 
>>> #1\\.0e\\+0 3
>>>     gcc.target/arm/thumb2-slow-flash-data-5.c scan-assembler-not #1\\.0e\\+0
>>>
>>> - The 2 cells with "BIG-REG" mean regressions where detected, but the report
>>> is >100kb so it was attached as a .xz file, click on 'BIG-REG" to download 
>>> it.
>>>
>>> - the cells with "BETTER" can be questionable: it means no new failure
>>> appeared, but PASS -> UNSUPPORTED is considered as "better". Here
>>> we have cases with several thousands (!) of tests becoming unsupported,
>>> which looks a bit suspicious.
>>>
>>> Among other things, I've noticed that when passing -march=armv5t,
>>> arm_neon_ok fails:
>>> xgcc -march=armv5t -fno-diagnostics-show-caret
>>> -fdiagnostics-color=never -mfpu=neon -mfloat-abi=softfp -march=
>>> armv7-a -c -o arm_neon_ok15574.o arm_neon_ok15574.c
>>> arm_neon_ok15574.c:9:4: error: #error Architecture does not support NEON.
>>> compiler exited with status 1
>>>
>>> I'd expect -march=armv7-a to allow to use -mfpu=neon?
>>>
>>
>> It does.  The problem seems to be a generic one in the driver in that
>> the rewrite rules are always passed the first instance of -march and not
>> the last.  Indeed, with the aarch64 compiler, if I write
>>
>> gcc -mcpu=native -mcpu=cortex-a53
>>
>> then the driver will rewrite this as
>>
>> ./cc1 -mcpu=cortex-a53 -mcpu=<expansion of native cpu name>
>>
>> which doesn't seem to be the right thing at all.
>>
>> So we either have a generic problem with all option rewriting, or there
>> are some subtle details of it that we've not figured out yet.
>>
>
> It turns out that there's also a problem with the test framework for
> this particular test.
>
I'm not sure which particular test you are referring to?


> The test has:
>
> /* { dg-require-effective-target arm_neon_ok } */
> /* { dg-require-effective-target arm_thumb2_ok } */
> /* { dg-options "-O2 -mthumb -march=armv7-a" } */
> /* { dg-add-options arm_neon } */
>
> This combination of flags goes through the following process:
>
> firstly it checks that Neon is ok.  That's fine
> Then it checks for thumb2.  That's also fine
> Then it adds the extra options.
> Finally it adds any final options needed to enable Neon.
>
> Unfortunately, the last step uses a cache.  The cache is based on trying
> a number of options from the default option set, which inherits from the
> options built into the compiler, which in this case is -mcpu=cortex-a9.
>
> Now since -mcpu=cortex-a9 now implies neon, the generic framework
> deduces (correctly) that -mfloat-abi=softfp is sufficient to enable neon.
>
> Unfortunately, this is the point at which the dg-options statement
> messes things up - the -march=armv7-a overrides the internally selected
> default CPU (cortex-a9) and thus removes Neon (simd) from the
> instruction set, since we haven't specified an architecture that
> supports SIMD.
>
> I think the correct (second) fix for this test is to remove the -march
> option entirely from the dg-options list and then rely on dg-add-options
> arm_neon adding the correct flags for this target configuration.
>

Indeed, I think in general there should be no such -march=XXX option
in dg-options and that should be handled by the
dg-require-effective-target/dg-add-options pair.
Why was -march=armv7-a added to this test in the first place?

> R.
>

Re: [PATCH 00/30] [ARM] Reworking the -mcpu, -march and -mfpu options

Reply via email to