On 28/08/2020 15:27, Richard Earnshaw wrote: > On 28/08/2020 14:24, Christophe Lyon via Gcc-patches wrote: >> On Fri, 28 Aug 2020 at 14:00, Richard Earnshaw >> <richard.earns...@foss.arm.com> wrote: >>> >>> On 27/08/2020 14:27, Christophe Lyon via Gcc-patches wrote: >>>> In comment 14 from PR94538, it was suggested to switch off jump tables >>>> on thumb-1 cores when using -mpure-code, like we already do for thumb-2. >>>> >>>> This is what this patch does, and also restores the previous value of >>>> CASE_VECTOR_PC_RELATIVE since it was not the right way of handling >>>> this. >>>> >>>> It also adds a new test, since the existing no-casesi.c did not catch >>>> this problem. >>>> >>>> Tested by running the whole testsuite with -mpure-code -mcpu=cortex-m0 >>>> -mfloat-abi=soft, no regression and the new test passes (and fails >>>> without the fix). >>>> >>>> 2020-08-27 Christophe Lyon <christophe.l...@linaro.org> >>>> >>>> gcc/ >>>> * config/arm/arm.h (CASE_VECTOR_PC_RELATIVE): Remove condition on >>>> -mpure-code. >>>> * config/arm/thumb1.md (tablejump): Disable when -mpure-code. >>>> >>>> gcc/testsuite/ >>>> * gcc.target/arm/pure-code/pr96768.c: New test. >>>> --- >>>> gcc/config/arm/arm.h | 5 ++--- >>>> gcc/config/arm/thumb1.md | 2 +- >>>> gcc/testsuite/gcc.target/arm/pure-code/pr96768.c | 21 >>>> +++++++++++++++++++++ >>>> 3 files changed, 24 insertions(+), 4 deletions(-) >>>> create mode 100644 gcc/testsuite/gcc.target/arm/pure-code/pr96768.c >>>> >>>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h >>>> index 3887c51..7d43721 100644 >>>> --- a/gcc/config/arm/arm.h >>>> +++ b/gcc/config/arm/arm.h >>>> @@ -1975,10 +1975,9 @@ enum arm_auto_incmodes >>>> for the index in the tablejump instruction. */ >>>> #define CASE_VECTOR_MODE Pmode >>>> >>>> -#define CASE_VECTOR_PC_RELATIVE ((TARGET_THUMB2 >>>> \ >>>> +#define CASE_VECTOR_PC_RELATIVE (TARGET_THUMB2 >>>> \ >>>> || (TARGET_THUMB1 \ >>>> - && (optimize_size || flag_pic))) \ >>>> - && (!target_pure_code)) >>>> + && (optimize_size || flag_pic))) >>>> >>>> >>>> #define CASE_VECTOR_SHORTEN_MODE(min, max, body) \ >>>> diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md >>>> index f0129db..602039e 100644 >>>> --- a/gcc/config/arm/thumb1.md >>>> +++ b/gcc/config/arm/thumb1.md >>>> @@ -2012,7 +2012,7 @@ (define_insn "*epilogue_insns" >>>> (define_expand "tablejump" >>>> [(parallel [(set (pc) (match_operand:SI 0 "register_operand")) >>>> (use (label_ref (match_operand 1 "" "")))])] >>>> - "TARGET_THUMB1" >>>> + "TARGET_THUMB1 && !target_pure_code" >>>> " >>>> if (flag_pic) >>>> { >>>> diff --git a/gcc/testsuite/gcc.target/arm/pure-code/pr96768.c >>>> b/gcc/testsuite/gcc.target/arm/pure-code/pr96768.c >>>> new file mode 100644 >>>> index 0000000..fd4cad5 >>>> --- /dev/null >>>> +++ b/gcc/testsuite/gcc.target/arm/pure-code/pr96768.c >>>> @@ -0,0 +1,21 @@ >>>> +/* { dg-do compile } */ >>>> +/* { dg-options "-mpure-code" } */ >>>> + >>>> +int f2 (int x, int y) >>>> +{ >>>> + switch (x) >>>> + { >>>> + case 0: return y + 0; >>>> + case 1: return y + 1; >>>> + case 2: return y + 2; >>>> + case 3: return y + 3; >>>> + case 4: return y + 4; >>>> + case 5: return y + 5; >>>> + } >>>> + return y; >>>> +} >>>> + >>>> +/* We do not want any load from literal pool, but accept loads from r7 >>>> + (frame pointer, used at -O0). */ >>>> +/* { dg-final { scan-assembler-not "ldr\tr\\d+, \\\[r\[0-689\]+\\\]" } } >>>> */ >>>> +/* { dg-final { scan-assembler "text,\"0x20000006\"" } } */ >>>> >>> >>> Having switch tables is only a problem if they are embedded in the .text >>> segment. But the case you show in the PR has them in the .rodata >>> segment, so is this really necessary? >>> >> >> Well, in the original PR94538, comment #14, Wilco said this was the best >> option. >> > > If the choice were not really a choice (ie pure code could not use > switch tables and still be pure) yes, it would be the case. But that > isn't the case. > > GCC already knows generally when using jump sequences is going to be > better than switch tables and when tables are likely to be better. It > can even produce a meld of the two when appropriate, it can even take > into account whether or not we are optimizing for size. So we shouldn't > be just ride rough-shod over those choices. Pure code doesn't really > change the balance. > > R. > >>> R. >
Also note that it would be possible to compress the data tables in a similar way to the aarch64 port; there's no need for the entries to be 32 bits most of the time. It just needs someone to do the work. R.