Re: [arm] GCC validation: preferred way of running the testsuite?

Christophe Lyon via Gcc Tue, 26 May 2020 10:05:02 -0700

On Tue, 19 May 2020 at 13:28, Richard Earnshaw
<[email protected]> wrote:
>
> On 11/05/2020 17:43, Christophe Lyon via Gcc wrote:
> > Hi,
> >
> >
> > As you may know, I've been running validations of GCC trunk in many
> > configurations for Arm and Aarch64.
> >
> >
> > I was recently trying to make some cleanup in the new Bfloat16, MVE,
> > CDE, and ACLE tests because in several configurations I see 300-400 FAILs
> > mainly in these areas, because of “testisms”. The goal is to avoid
> > wasting time over the same failure reports when checking what needs
> > fixing. I thought this would be quick & easy, but this is tedious
> > because of the numerous combinations of options and configurations
> > available on Arm.
> >
> >
> > Sorry for the very long email, it’s hard to describe and summarize,
> > but I'd like to try nonetheless, hoping that we can make testing
> > easier/more efficient :-), because most of the time the problems I
> > found are with the tests rather than real compiler bugs, so I think
> > it's a bit of wasted time.
> >
> >
> > Here is a list of problems, starting with the tricky dependencies
> > around -mfloat-abi=XXX:
> >
> > * Some targets do not support multilibs (eg arm-linux-gnueabi[hf] with
> > glibc), or one can decide not to build with both hard and soft FP
> > multilibs. This generally becomes a problem when including stdint.h
> > (used by arm_neon.h, arm_acle.h, …), leading to a compiler error for
> > lack of gnu/stub*.h for the missing float-abi. If you add -mthumb to
> > the picture, it becomes quite complex (eg -mfloat-abi=hard is not
> > supported on thumb-1).
> >
> >
> > Consider mytest.c that does not depend on any include file and has:
> > /* { dg-options "-mfloat-abi=hard" } */
> >
> > If GCC is configured for arm-linux-gnueabi --with-cpu=cortex-a9
> > --with-fpu=neon, with ‘make check’, the test PASSes.
> > With ‘make check’ with --target-board=-march=armv5t/-mthumb, then the
> > test FAILs:
> > sorry, unimplemented: Thumb-1 hard-float VFP ABI
> >
> >
> > If I add
> > /* { dg-require-effective-target arm_hard_ok } */
> > ‘make check’ with --target-board=-march=armv5t/-mthumb is now
> > UNSUPPORTED (which is OK), but
> > plain ‘make check’ is now also UNSUPPORTED because arm_hard_ok detects
> > that we lack the -mfloat-abi=hard multilib. So we lose a PASS.
> >
> > If I configure GCC for arm-linux-gnueabihf, then:
> > ‘make check’ PASSes
> > ‘make check’ with --target-board=-march=armv5t/-mthumb, FAILs
> > and with
> > /* { dg-require-effective-target arm_hard_ok } */
> > ‘make check’ with --target-board=-march=armv5t/-mthumb is now
> > UNSUPPORTED and plain ‘make check’ PASSes
> >
> > So it seems the best option is to add
> > /* { dg-require-effective-target arm_hard_ok } */
> > although it makes the test UNSUPPORTED by arm-linux-gnueabi even in
> > cases where it could PASS.
> >
> > Is there consensus that this is the right way?
> >
> >
> >
> > * In GCC DejaGnu helpers, the queries for -mfloat-abi=hard and
> > -march=XXX are independent in general, meaning if you query for
> > -mfloat-abi=hard support, it will do that in the absence of any
> > -march=XXX that the testcase may also be using. So, if GCC is
> > configured with its default cpu/fpu, -mfloat-abi=hard will be rejected
> > for lack of an fpu on the default cpu, but if GCC is configured with a
> > suitable cpu/fpu pair, -mfloat-abi=hard will be accepted.
> >
> > I faced this problem when I tried to “fix” the order in which we try
> > options in arm_v8_2a_bf16_neon_ok (see
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544654.html).
> >
> > I faced similar problems while working on a patch of mine about a bug
> > with IRQ handlers which has different behaviour depending on the FP
> > ABI used: I have the feeling that I spend too much time writing the
> > tests to the detriment of the patch itself...
> >
> > I also noticed that Richard Sandiford probably faced similar issues
> > with his recent fix for "no_unique_address", where he finally added
> > arm_arch_v8a_hard_ok to check armv8-a CPU + neon-fp-armv8 FPU +
> > float-abi=hard at the same time.
> >
> > Maybe we could decide on a consistent and simpler way of checking such
> > things?
> >
> >
> > * A metric for this complexity could be the number of arm
> > effective-targets, a quick and not-fully accurate grep | sed | sort |
> > uniq -c | sort -n on target-supports.exp ends with:
> >      9 mips
> >      16 aarch64
> >      21 powerpc
> >      97 vect
> >     106 arm
> > (does not count all the effective-targets generated by tcl code, eg
> > arm_arch_FUNC_ok)
> >
> > This probably explains why it’s hard to get test directives right :-)
> >
> > I’ve not thought about how we could reduce that number….
> >
> >
> >
> > * Finally, I’m wondering about the most appropriate way of configuring
> > GCC and running the tests.
> >
> > So far, for most of the configurations I'm testing, I use different
> > --with-cpu/--with-fpu/--with-mode configure flags for each toolchain
> > configuration I’m testing and rarely override the flags at testing
> > time. I also disable multilibs to save build time and (scratch) disk
> > space. (See 
> > https://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/0latest/report-build-info.html
> > for the current list, each line corresponds to a clean build + make
> > check job -- so there are 15 different toolchain configs for
> > arm-linux-gnueabihf for instance)
> >
> > However, I think this may not be appropriate at least for the
> > arm-eabi toolchains, because I suspect the vendors who support several
> > SoCs generally ship one binary toolchain built with the default
> > cpu/fpu/mode and the appropriate multilibs (aprofile or rmprofile),
> > and the associated IDE adds the right -mcpu/-mfpu flags (see
> > arm-embedded toolchain, ST CubeMX for stm32). So it seems to me that
> > the "appropriate" way of testing such a toolchain is to build it with
> > the default settings and appropriate multilibs and add the needed
> > -mcpu/-mfpu variants at 'make check' time.
> >
> > I would still build one toolchain per configuration I want to test and
> > not use runtest’s capability to iterate over several combinations:
> > this way I can run the tests in parallel and reduce the total time
> > needed to get the results.
> >
> > One can compare the results of both options with the two lines with
> > cortex-m33 in the above table (target arm-none-eabi).
> >
> > In the first one, GCC is configured for cortex-m33, and tests executed
> > via plain ‘make check’: 401 failures in gcc. (duration ~2h, disk space
> > 14GB)
> >
> > In the 2nd line, GCC is configured with the default cpu/fpu, multilibs
> > enabled and I use test flags suitable for cortex-m33: now only 73
> > failures for gcc. (duration ~3h15, disk space 26GB). Note that there
> > are more failures for g++ and libstdc++ than for the previous line, I
> > haven’t fully checked why -- for libstdc++ there are spurious
> > -march=armv8-m.main+fp flags in the log. So this is not the magic
> > bullet.
> >
> >
> > Unfortunately, this means every test with arm_hard_ok effective target
> > would be unsupported (lack of fpu on default cpu) whatever the
> > validation cflags. The increased build time (many multilibs built for
> > nothing) will also reduce the validation bandwidth (I hope the
> > increased scratch disk space will not be a problem with my IT…)
> >
> >
> >
> > OTOH, I have a feeling that arm-linux-gnueabi* toolchain vendors
> > probably prefer to tune them for their preferred default CPU. For
> > instance I have an arm board running Ubuntu with gcc-5.4 configured
> > --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard
> > --with-mode=thumb.
> >
> > If this is right, it would mean I should keep the configurations I
> > currently use for arm-linux* (no multilib, rely on default cpu/fpu).
> >
> > ** Regarding the flags used for testing, I’m also wondering what’s the
> > most appropriate: -mcpu or -march. Both probably have pros and cons?
> >
> > In https://gcc.gnu.org/pipermail/gcc/2019-September/230258.html, I
> > described a problem where it seems that one expects the tests to run
> > with -march=XXX.
> >
> > Another log of mine has an effective-target helper compiled with:
> > -mthumb -mcpu=cortex-m33 -mfloat-abi=hard -mfloat-abi=softfp
> > -mfpu=auto -march=armv8.1-m.main+mve.fp -mthumb
> > which produces this error:
> > cc1: warning: switch '-mcpu=cortex-m33' conflicts with
> > '-march=armv8.1-m.main' switch
> > which looks suspicious: running the tests in multiple ways surely
> > helps uncover bugs...
> >
> >
> > In summary, I’d like to gather opinions on:
> > * appropriate usage of dg-require-effective-target arm_hard_ok
> > * how to improve float-abi support detection in combination with
> > architecture level
> > * hopefully consensus on choosing how to configure the toolchain and
> > run the tests. I’m suggesting default config + multilibs +
> > runtest-flags for arm-eabi and a selection of default cpu/fpu + less
> > runtest-flags for arm-linux*.
> >
> >
> > Thanks for reading that far :-)
> >
> >
> > Christophe
> >
>


Thanks for your answer.


> I've been pondering this for some time now (well before you sent your mail).
>
> My feeling is that trying to control this via dejagnu options is just
> getting too fiddly.  Perhaps a new approach is called for.
>
> My thoughts are along the line of reworking the tests to use
>
>   #pragma target <option>
>
> etc (or the attribute equivalent), to set the compilation state to
> something appropriate for the test so that the output is reasonable for
> that and then we can stabilize the test.
>
> It only works for assembly tests, not for anything that requires linking
> or execution: but for those tests we shouldn't be looking for a specific
> output but a specific behaviour and we can tolerate more variation in
> the instructions that implement that behaviour (hybrid tests would need
> splitting).

I'm not sure I fully understand what you mean: if we add #pragma CPU XXX
to a test for instance, and then run the tests with -mcpu=YYY, then
the test will still be compiled for XXX, right?
How would we detect that the generated code is wrong if compiling for YYY?
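
To make the question concrete, here is a minimal sketch of a test that pins its own target state. It uses the attribute form with "arch=", on the assumption that the compiler accepts it on Arm; TEST_ARCH, add_one and the choice of armv7-a are hypothetical, and nothing like this exists in the testsuite today as far as this thread says.

```c
/* { dg-do compile } */

/* Sketch only: TEST_ARCH is a hypothetical helper macro and "arch=armv7-a"
   is an arbitrary architecture picked for illustration.  */
#if defined(__arm__)
# define TEST_ARCH __attribute__((target("arch=armv7-a")))
#else
# define TEST_ARCH /* not an Arm target: keep the default settings */
#endif

/* On Arm, this function is expected to be compiled for the architecture
   named in the attribute rather than for the -mcpu/-march given on the
   command line, which is exactly the concern raised above.  */
TEST_ARCH int add_one(int x)
{
  return x + 1;
}
```

If the attribute is honoured, running such a test with -mcpu=YYY would still exercise XXX only, so code generation problems specific to YYY would need other tests to catch them.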

>
> It's a fair amount of work, though, since many of the required options
> cannot be controlled today via the attributes.  It's also not entirely
Indeed!

Not to mention that we would also have to decorate the many existing tests.

> clear whether these should be exposed to users, since in most cases such
> control is unlikely to be of use in real code.
Probably indeed.

For the record, I've changed the way I run the validations for
arm-eabi as I described in my original email:
I now use the default cpu/fpu/mode at GCC configure time, enable the
relevant multilibs then override the compilation flags when running
the tests.

For instance: -mthumb/-mcpu=cortex-m33/-mfloat-abi=hard

The number of failures is now lower than it used to be when
configuring --with-cpu=cortex-m33.
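
For reference, a minimal sketch of the kind of test these settings affect, modelled on the mytest.c case from the original mail. The two directives on -mfloat-abi=hard and arm_hard_ok are the ones discussed there; the dg-do line, the function name scale and its body are illustrative assumptions.

```c
/* { dg-do compile } */
/* { dg-require-effective-target arm_hard_ok } */
/* { dg-options "-mfloat-abi=hard" } */

/* Hypothetical body: any code that does not need an include file would do.
   The point is the directives above: arm_hard_ok turns the test into
   UNSUPPORTED when -mfloat-abi=hard cannot be used, instead of a FAIL.  */
float scale(float x)
{
  return x * 2.0f;
}
```

With the directives in place, a run that overrides the flags with something like -march=armv5t/-mthumb reports UNSUPPORTED rather than failing with "Thumb-1 hard-float VFP ABI".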

Christophe
