[Bug target/96847] Code size increase +42% depending on memory size allocated on stack for ARM Cortex-M3

2020-08-31 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96847

--- Comment #2 from Fredrik Hederstierna 
 ---
OK, thanks. I also wanted to clarify that the size increase was not actually
due to changing array sizes; it is a difference between GCC-9.2 and GCC-10.2
for the _same_ array lengths. So GCC-10.2 generates worse code than previous
GCC versions for this exact same code.

BR Fredrik

[Bug c/96847] New: Code size increase +42% depending on memory size allocated on stack for ARM Cortex-M3

2020-08-29 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96847

Bug ID: 96847
   Summary: Code size increase +42% depending on memory size
allocated on stack for ARM Cortex-M3
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fredrik.hederstie...@securitas-direct.com
  Target Milestone: ---

Created attachment 49156
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49156=edit
Example showing +42% increase depending on stack mem array sizes

When compiling with GCC-10.x.0 I get a code size increase that depends on the
size of the arrays allocated on the stack.

Older GCC-9.x.0 does not show this size increase.

On a slightly constructed test case from CSiBE bzip2 I get more than +42% size
increase.

Target: arm-none-eabi Cortex-M3

See the attached example: if I choose a stack array size that is 2 bytes
smaller, I get a totally different result. How can stack array sizes make this
difference, and why is this new with GCC-10.x?

[Bug analyzer/95152] ICE in get_or_create_mem_ref, at analyzer/region-model.cc:6938 since r10-5950-g757bf1dff5e8cee3

2020-05-18 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95152

--- Comment #4 from Fredrik Hederstierna 
 ---
Stripped down example:

File :


typedef struct {
  int var;
} info_t;
extern void *_data_offs;
void test()
{
  info_t *info = ((void *)((void *)1) + ((unsigned int)&_data_offs));
  my_func(info->var == 0);
}

Output:

during IPA pass: analyzer
test2.i:10:42: internal compiler error: in get_or_create_mem_ref, at
analyzer/region-model.cc:6938
   10 |   info_t *info = ((void *)((void *)1) + ((unsigned int)&_data_offs));
  | ~^~
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.


Compilation:

/opt/gcc/arm-none-eabi-toolchain-gcc-10.1.0-binutils-2.33.1-newlib-3.3.0-hardfloat/bin/arm-none-eabi-gcc
-fanalyzer -O2 -o test.o test.i

BR Fredrik

[Bug analyzer/95152] internal compiler error: in get_or_create_mem_ref, at analyzer/region-model.cc:6938

2020-05-16 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95152

Fredrik Hederstierna  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fredrik.hederstierna@securitas-direct.com

--- Comment #2 from Fredrik Hederstierna 
 ---
I have had the same problem with arm-none-eabi-gcc (GCC) 10.1.0, using
-fanalyzer.

Compiling my_test.c ..
during IPA pass: analyzer
falcon_fota.c: In function 'my_verify.part.0':
falcon_fota.c:629:5: internal compiler error: in get_or_create_mem_ref, at
analyzer/region-model.cc:6938
  629 | app_info_t *app_info = RAM_APP_INFO_POS;
  | ^
Please submit a full bug report,
with preprocessed source if appropriate.


The pointer RAM_APP_INFO_POS is a rather special reference: it takes the
address of an external variable and adds a constant to that value:

// example
extern void *_app_data_offs;
#define RAM_ADDRESS  ((uint32_t)(0xC000))
#define APP_INFO_POS ((uint32_t)&_app_data_offs)
#define RAM_APP_INFO_POS (RAM_ADDRESS + APP_INFO_POS)
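
To make the pattern concrete, here is a minimal host-side sketch of the same
address arithmetic. It is an illustration under assumptions, not the code from
falcon_fota.c: the layout of app_info_t, the stand-in symbol app_data_offs_stub
and the helper ram_app_info_pos() are hypothetical, and uintptr_t is used
instead of uint32_t so that the sketch also builds on 64-bit hosts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout; the real app_info_t is not shown in the report. */
typedef struct {
    int var;
} app_info_t;

/* Stand-in for the linker-provided symbol _app_data_offs (assumption). */
char app_data_offs_stub[16];

#define RAM_ADDRESS ((uintptr_t)0xC000)

/* Take the address of a symbol, add a constant, and convert the resulting
   integer back to a pointer. The pointer is only computed, never
   dereferenced, because the address is not backed by real memory here. */
app_info_t *ram_app_info_pos(const void *offs_symbol)
{
    uintptr_t addr = RAM_ADDRESS + (uintptr_t)offs_symbol;
    return (app_info_t *)addr;
}

/* Small self-check: the pointer value is base constant plus symbol address. */
void check_ram_app_info_pos(void)
{
    uintptr_t expected = RAM_ADDRESS + (uintptr_t)app_data_offs_stub;
    assert((uintptr_t)ram_app_info_pos(app_data_offs_stub) == expected);
}
```

The point of the sketch is only that the pointer value comes from integer
arithmetic on a symbol address, which is the kind of expression the analyzer
had to model when it hit the ICE.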


arm-eabi-gcc -v

Using built-in specs.
COLLECT_GCC=/opt/gcc/arm-none-eabi-toolchain-gcc-10.1.0-binutils-2.33.1-newlib-3.3.0-hardfloat/bin/arm-none-eabi-gcc
COLLECT_LTO_WRAPPER=/opt/gcc/arm-none-eabi-toolchain-gcc-10.1.0-binutils-2.33.1-newlib-3.3.0-hardfloat/libexec/gcc/arm-none-eabi/10.1.0/lto-wrapper
Target: arm-none-eabi
Configured with: ../../gcc-10.1.0/configure --enable-languages=c,c++
--target=arm-none-eabi
--prefix=/opt/gcc/arm-none-eabi-toolchain-gcc-10.1.0-binutils-2.33.1-newlib-3.3.0-hardfloat
--with-gnu-as --with-gnu-ld --with-newlib --with-system-zlib
--with-endian=little --disable-interwork --with-mode=thumb --with-abi=aapcs
--with-cpu=cortex-m4 --with-float=hard --with-fpu=fpv4-sp-d16 --disable-nls
--disable-libssp --enable-multilib
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 10.1.0 (GCC)

[Bug target/9663] [arm] gcc-20030127 misses an optimization opportunity

2019-06-06 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663

--- Comment #12 from Fredrik Hederstierna 
 ---
Created attachment 46458
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46458=edit
range ran through preprocessor using -E -P

I'm not sure if this is what you wanted, but here is a stand-alone compilable
file, without headers. Preprocessed with -E -P.
/Fredrik

[Bug target/9663] [arm] gcc-20030127 misses an optimization opportunity

2019-06-06 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663

Fredrik Hederstierna  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fredrik.hederstierna@securitas-direct.com

--- Comment #10 from Fredrik Hederstierna 
 ---
Created attachment 46457
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46457=edit
Testcase from CSiBE teem sources

Tested with gcc-9.1.0 for ARM 32bit targets.

Without peephole2

 :
   0:   e92d407f    push    {r0, r1, r2, r3, r4, r5, r6, lr}
   4:   e2504000    subs    r4, r0, #0
   8:   0a00003f    beq     10c
   c:   e3510000    cmp     r1, #0
  10:   e1a05001    mov     r5, r1

With peephole2

 :
   0:   e92d407f    push    {r0, r1, r2, r3, r4, r5, r6, lr}
   4:   e2504000    subs    r4, r0, #0
   8:   0a00003e    beq     108
   c:   e2515000    subs    r5, r1, #0

/Fredrik

[Bug c/90705] New: Suboptimal register allocation on ARM when compiling for size

2019-06-01 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90705

Bug ID: 90705
   Summary: Suboptimal register allocation on ARM when compiling
for size
   Product: gcc
   Version: 9.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fredrik.hederstie...@securitas-direct.com
  Target Milestone: ---

Created attachment 46441
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46441=edit
test.c

When compiling this simple example for ARM (-mcpu=cortex-m0) with gcc-9.1.0,
the generated code looks OK when using -O2, but the register allocation looks
weird when compiling with -Os. Registers are pushed on the stack, and the code
actually gets a lot bigger.

Example

int k;
int test(int i)
{
  int r = 0;
  for (; i >= 0; i--) {
    k = i;
    r += k;
  }
  return r;
}


Compiling gcc-9.1.0, -mcpu=cortex-m0 using -O2:

 :
   0:   0003        movs    r3, r0
   2:   2000        movs    r0, #0
   4:   2b00        cmp     r3, #0
   6:   db05        blt.n   14
   8:   18c0        adds    r0, r0, r3
   a:   3b01        subs    r3, #1
   c:   d2fc        bcs.n   8
   e:   2200        movs    r2, #0
  10:   4b01        ldr     r3, [pc, #4]    ; (18)
  12:   601a        str     r2, [r3, #0]
  14:   4770        bx      lr
  16:   46c0        nop                     ; (mov r8, r8)
  18:   00000000    .word   0x00000000


but when compiling with same compiler with -Os:

 :
   0:   2200        movs    r2, #0
   2:   b530        push    {r4, r5, lr}
   4:   0003        movs    r3, r0
   6:   2501        movs    r5, #1
   8:   0010        movs    r0, r2
   a:   4906        ldr     r1, [pc, #24]   ; (24)
   c:   680c        ldr     r4, [r1, #0]
   e:   2b00        cmp     r3, #0
  10:   da03        bge.n   1a
  12:   2a00        cmp     r2, #0
  14:   d000        beq.n   18
  16:   600c        str     r4, [r1, #0]
  18:   bd30        pop     {r4, r5, pc}
  1a:   001c        movs    r4, r3
  1c:   18c0        adds    r0, r0, r3
  1e:   002a        movs    r2, r5
  20:   3b01        subs    r3, #1
  22:   e7f4        b.n     e
  24:   00000000    .word   0x00000000

This uses 2 more registers plus the stack, and the code size is significantly
larger.
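
As an aid for reading the two listings, the observable behaviour of the loop
can be written down explicitly. The sketch below is an illustration added for
reference and is not part of the attached test.c; the names test_loop,
test_closed_form and check_test_equivalence are hypothetical. It assumes small
values of i so that the sum does not overflow.

```c
#include <assert.h>

int k;

/* The loop from the report, unchanged apart from its name. */
int test_loop(int i)
{
    int r = 0;
    for (; i >= 0; i--) {
        k = i;
        r += k;
    }
    return r;
}

/* Same observable result without the loop, for small non-negative i:
   the sum i + (i-1) + ... + 0, and one final store k = 0.
   For i < 0 the loop body never runs, so k is left untouched. */
int test_closed_form(int i)
{
    if (i < 0)
        return 0;
    k = 0;
    return i * (i + 1) / 2;
}

/* Self-check: both versions agree on the return value and on k. */
void check_test_equivalence(void)
{
    for (int i = -3; i <= 100; i++) {
        k = 42;
        int r_loop = test_loop(i);
        int k_loop = k;
        k = 42;
        int r_closed = test_closed_form(i);
        assert(r_loop == r_closed && k_loop == k);
    }
}
```

Read this way, the -O2 listing keeps only the running sum in the loop and
performs a single store of 0 to k afterwards, while the -Os listing carries the
pending value of k and a flag in two extra registers.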

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

2019-05-22 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #6 from Fredrik Hederstierna 
 ---
Created attachment 46397
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46397=edit
Some more patterns

Looking into this I found some more places where the code seems to be
non-optimal. It may be a separate issue, but these are also examples of
equivalent evaluations for unsigned types:

Test 1
  (x > y) || (x > (y / N))   equal to
  (x > (y / N))

Test 2
  (x > y) || (x > (y >> N))  equal to
  (x > (y >> N))

Test 3
  (x > y) && (x > (y / N))   equal to
  (x > y)

Test 4
  (x > y) && (x > (y >> N))  equal to
  (x > y)
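
The four equivalences above can be checked by brute force. The following is a
minimal sketch, assuming 8-bit unsigned operands and N in the range 1 to 7;
the function names are hypothetical and this is not the attached test code.
The equivalences rest on the fact that for unsigned y and N >= 1, both y / N
and y >> N are never larger than y.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the number of (x, y, n) triples for which any of the four
   equivalences fails; 0 means they all hold on the tested range. */
unsigned long count_mismatches(void)
{
    unsigned long bad = 0;
    for (unsigned x = 0; x <= UINT8_MAX; x++) {
        for (unsigned y = 0; y <= UINT8_MAX; y++) {
            for (unsigned n = 1; n <= 7; n++) {
                unsigned q = y / n;    /* used by Test 1 and Test 3 */
                unsigned s = y >> n;   /* used by Test 2 and Test 4 */
                bad += ((x > y || x > q) != (x > q));  /* Test 1 */
                bad += ((x > y || x > s) != (x > s));  /* Test 2 */
                bad += ((x > y && x > q) != (x > y));  /* Test 3 */
                bad += ((x > y && x > s) != (x > y));  /* Test 4 */
            }
        }
    }
    return bad;
}

/* Self-check: no counterexample exists in the tested range. */
void check_unsigned_patterns(void)
{
    assert(count_mismatches() == 0);
}
```

Note that the check only covers unsigned values; for signed types the
relation between y and y / N does not hold in general, so the simplification
would not be valid there.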

One thing to consider here may be that, depending on whether we optimize for
size or speed, the order of evaluation can be changed: if some operation is
costly, it could be skipped to gain speed, assuming an earlier operand in the
list already decides the result. But when optimizing for size, I think the
more simplified expression should always apply?

Example for arm using above expressions,
(code attached)

 :
   0:   b510        push    {r4, lr}
   2:   000b        movs    r3, r1
   4:   0004        movs    r4, r0
   6:   2001        movs    r0, #1
   8:   428c        cmp     r4, r1
   a:   d807        bhi.n   1c
   c:   2103        movs    r1, #3
   e:   0018        movs    r0, r3
  10:   f7ff fffe   bl      0 <__aeabi_uidiv>
  14:   b2c0        uxtb    r0, r0
  16:   42a0        cmp     r0, r4
  18:   4180        sbcs    r0, r0
  1a:   4240        negs    r0, r0
  1c:   bd10        pop     {r4, pc}

001e :
  1e:   b510        push    {r4, lr}
  20:   0004        movs    r4, r0
  22:   0008        movs    r0, r1
  24:   2103        movs    r1, #3
  26:   f7ff fffe   bl      0 <__aeabi_uidiv>
  2a:   b2c0        uxtb    r0, r0
  2c:   42a0        cmp     r0, r4
  2e:   4180        sbcs    r0, r0
  30:   4240        negs    r0, r0
  32:   bd10        pop     {r4, pc}

0034 :
  34:   0003        movs    r3, r0
  36:   2001        movs    r0, #1
  38:   428b        cmp     r3, r1
  3a:   d803        bhi.n   44
  3c:   08c9        lsrs    r1, r1, #3
  3e:   4299        cmp     r1, r3
  40:   4189        sbcs    r1, r1
  42:   4248        negs    r0, r1
  44:   4770        bx      lr

0046 :
  46:   08c9        lsrs    r1, r1, #3
  48:   4281        cmp     r1, r0
  4a:   4180        sbcs    r0, r0
  4c:   4240        negs    r0, r0
  4e:   4770        bx      lr

0050 :
  50:   b510        push    {r4, lr}
  52:   000b        movs    r3, r1
  54:   0004        movs    r4, r0
  56:   2000        movs    r0, #0
  58:   428c        cmp     r4, r1
  5a:   d907        bls.n   6c
  5c:   2103        movs    r1, #3
  5e:   0018        movs    r0, r3
  60:   f7ff fffe   bl      0 <__aeabi_uidiv>
  64:   b2c0        uxtb    r0, r0
  66:   42a0        cmp     r0, r4
  68:   4180        sbcs    r0, r0
  6a:   4240        negs    r0, r0
  6c:   bd10        pop     {r4, pc}

006e :
  6e:   4281        cmp     r1, r0
  70:   4180        sbcs    r0, r0
  72:   4240        negs    r0, r0
  74:   4770        bx      lr

0076 :
  76:   0003        movs    r3, r0
  78:   2000        movs    r0, #0
  7a:   428b        cmp     r3, r1
  7c:   d903        bls.n   86
  7e:   08c9        lsrs    r1, r1, #3
  80:   4299        cmp     r1, r3
  82:   4189        sbcs    r1, r1
  84:   4248        negs    r0, r1
  86:   4770        bx      lr

0088 :
  88:   4281        cmp     r1, r0
  8a:   4180        sbcs    r0, r0
  8c:   4240        negs    r0, r0
  8e:   4770        bx      lr

[Bug middle-end/90340] Not optimal code when compiling switch-case for size, code increase +35%

2019-05-14 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90340

--- Comment #22 from Fredrik Hederstierna 
 ---
Was the default "max_ratio_for_size = 2" changed?
Also, changing this to "1" unfortunately came nowhere near the size of gcc-8.2,
so I guess this code growth also depends on regressions from other changes?

[Bug middle-end/90340] Not optimal code when compiling switch-case for size, code increase +35%

2019-05-13 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90340

--- Comment #16 from Fredrik Hederstierna 
 ---
So you still cannot reach the code size of gcc-8.3.0? Then something in the
new switch-case compilation still generates larger code?

[Bug middle-end/90340] Not optimal code when compiling switch-case for size, code increase +35%

2019-05-06 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90340

--- Comment #10 from Fredrik Hederstierna 
 ---
I also tested gcc-9.1.0 with "max_ratio_for_size = 1", just out of curiosity.

The results were similar, compared to gcc-8.2.0:

Overall CSiBE was

2 417 695 bytes (+4185 bytes, +0.17%)

Example file CSiBE "vsprintf.c" was

2369 bytes (+76 bytes, +3.2%)

[Bug middle-end/90340] Not optimal code when compiling switch-case for size, code increase +35%

2019-05-06 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90340

--- Comment #9 from Fredrik Hederstierna 
 ---
I did the suggested test; the results were as follows:

A. gcc-8.2.0
B. gcc-9.1.0
C. gcc-9.1.0 -fno-jump-tables
D. gcc-9.1.0 patched "max_ratio_for_size = 2"

Overall CSiBE was

A: 2 413 510 bytes
B: 2 417 915 bytes (+4405 bytes, +0.18%)
C: 2 423 413 bytes (+9903 bytes, +0.41%)
D: 2 417 739 bytes (+4229 bytes, +0.18%)

Example file CSiBE "vsprintf.c" was

A: 2369 bytes
B: 2589 bytes (+220 bytes, +9.3%)
C: 2445 bytes ( +76 bytes, +3.2%)
D: 2489 bytes (+120 bytes, +5.1%)
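
The percentages in the tables above are plain relative differences against
the gcc-8.2.0 baseline (row A). A minimal sketch of that arithmetic follows;
the helper names size_delta_percent and check_reported_deltas are hypothetical
and only the byte counts are taken from the tables.

```c
#include <assert.h>

/* Size change of `now` relative to `base`, in percent. */
double size_delta_percent(long base, long now)
{
    return 100.0 * (double)(now - base) / (double)base;
}

/* Self-check against two rows of the tables (A versus B). */
void check_reported_deltas(void)
{
    /* vsprintf.c: 2369 -> 2589 bytes, reported as +9.3% */
    double d = size_delta_percent(2369, 2589);
    assert(d > 9.25 && d < 9.35);

    /* Overall CSiBE: 2413510 -> 2417915 bytes, reported as +0.18% */
    d = size_delta_percent(2413510, 2417915);
    assert(d > 0.175 && d < 0.185);
}
```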

So it didn't really solve it, but possibly made it better,
though there are other code size regressions as well for gcc-9.1.0 which might
also affect the results.

[Bug middle-end/90340] Not optimal code when compiling switch-case for size, code increase +35%

2019-05-06 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90340

--- Comment #8 from Fredrik Hederstierna 
 ---
OK, thanks, I will try to have a look at it later tonight (I'm at my regular
job now ;-)
Thanks/Fredrik

[Bug middle-end/90340] Not optimal code when compiling switch-case for size, code increase +35%

2019-05-06 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90340

--- Comment #5 from Fredrik Hederstierna 
 ---
I use patched sources from
http://gcc.hederstierna.com/csibe

I think you could download and try it out.
Toolchain I build with
https://github.com/fredrikhederstierna/buildbuddy

Otherwise, I think there are at least two more code size regressions open for
gcc-9.1.0 on ARM against CSiBE, so maybe there is already some CSiBE test
setup running that could be a quicker path forward.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90255

I think the code size regressions on CSiBE for gcc-9.1.0 are being worked on.
Thanks/Fredrik

[Bug target/90340] Not optimal code when compiling C library for vsnprintf, code size increase 17%

2019-05-05 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90340

--- Comment #2 from Fredrik Hederstierna 
 ---
Created attachment 46297
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46297=edit
Stripped down testcase, gives +35% code size increase

[Bug target/90340] Not optimal code when compiling C library for vsnprintf, code size increase 17%

2019-05-05 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90340

--- Comment #1 from Fredrik Hederstierna 
 ---
Stripped down the testcase some, this version gives +35% code size increase.
With gcc-8.2.0 it was 536 bytes, when gcc-9.1.0 gives 724 bytes (+188 bytes).

[Bug c/90340] New: Not optimal code when compiling C library for vsnprintf, code size increase 17%

2019-05-03 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90340

Bug ID: 90340
   Summary: Not optimal code when compiling C library for
vsnprintf, code size increase 17%
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fredrik.hederstie...@securitas-direct.com
  Target Milestone: ---

Created attachment 46284
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46284=edit
Testcase with example code  from Linux C library

When compiling some old Linux kernel library code for vsnprintf, the generated
code seems suboptimal, and code size increased by almost 17% for Cortex-M.

This started with gcc-9.1.0; previous versions did not show this.

(Generally when testing against CSiBE the overall average code size increased
with gcc-9.1.0 compared to previous version for the first time since gcc-4.6.0)
http://gcc.hederstierna.com/csibe/

Attached is a stripped example file from the Linux library.
Compiled with -Os (makefile attached).

gcc-8.2.0 generated more compact code, size 806 bytes;
gcc-9.1.0 generated some large switch table, code size 942 bytes.
The difference is +136 bytes (+16.9%).

[Bug middle-end/85880] Different code generation for uninitialized variables

2018-05-22 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85880

--- Comment #4 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
OK, you are probably right. I was just surprised that GCC 4, 5, 6 and 7 all gave
the same result, but something changed with 8. But you are right, the results
are unpredictable since the behaviour is undefined. In practice it gave me some
issues, since some benchmarking suites contain code with uninitialized
variables, but that is an issue for the benchmarks, not for GCC itself.
So you are right, we can close it; I just wanted it to be confirmed as
unimportant.
Thanks, Fredrik

[Bug middle-end/85880] Different code generation for uninitialized variables

2018-05-22 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85880

--- Comment #2 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
All old GCC < 8
---

Disassembly of section .text:

 :
   0:   2000        movs    r0, #0
   2:   4770        bx      lr

0004 :
   4:   b500        push    {lr}
   6:   2000        movs    r0, #0
   8:   bd00        pop     {pc}

000a :
   a:   b500        push    {lr}
   c:   2000        movs    r0, #0
   e:   bd00        pop     {pc}

--

From GCC 8.1.0
---

Disassembly of section .text:

 :
   0:   2000        movs    r0, #0
   2:   4770        bx      lr

0004 :
   4:   b500        push    {lr}
   6:   2000        movs    r0, #0
   8:   bd00        pop     {pc}

000a :
   a:   2380        movs    r3, #128        ; 0x80
   c:   b510        push    {r4, lr}
   e:   005b        lsls    r3, r3, #1
  10:   2000        movs    r0, #0
  12:   2b00        cmp     r3, #0
  14:   d101        bne.n   1a <test_2+0x10>
  16:   f7ff fffe   bl      0
  1a:   bd10        pop     {r4, pc}

[Bug c/85880] Different code generation for uninitialized variables

2018-05-22 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85880

--- Comment #1 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Created attachment 44165
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44165=edit
Makefile

[Bug c/85880] New: Different code generation for uninitialized variables

2018-05-22 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85880

Bug ID: 85880
   Summary: Different code generation for uninitialized variables
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fredrik.hederstie...@securitas-direct.com
  Target Milestone: ---

Created attachment 44164
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44164=edit
Example file

Starting with GCC-8.1.0 the code generation for uninitialized variables seems to
have changed.

This is not necessarily a bug, but it became a problem for me since the
CSiBE code size benchmark has some files in the Linux code that use
uninitialized variables.

When compiling CSiBE with GCC-8.1.0, some files suddenly increased remarkably in
size; one example is the file "capability.c", which grew to many times its
previous size.

The root cause might be a CSiBE issue, but I just wanted to confirm
that this is as expected; it may also have an impact on other benchmarks that
include uninitialized variables.

See the attached stripped-down case taken from CSiBE "capability.c".

[Bug rtl-optimization/81625] GCC v4.7 ... v8 is bloating code by > 25% compared to v3.4

2017-08-07 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81625

--- Comment #5 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
I tried building several AVR toolchains from 3.4.6 to 7.1.0 and I can confirm
that code size increases as described. I suspect that for AVR this might start
already with 3.x -> 4.x.

Checked Bug 17549 - [4.0 Regression] 10% increase in codesize with C code
compared to GCC 3.3: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549

If the TER pass is disabled by adding "-fno-tree-ter", the results get more
than 10% smaller. The results are still about +10% worse than 3.4.6 even with
7.1.0, though adding -mstrict-X also makes it slightly better.

[Bug rtl-optimization/81625] GCC v4.7 ... v8 is bloating code by > 25% compared to v3.4

2017-08-01 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81625

Fredrik Hederstierna <fredrik.hederstie...@securitas-direct.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fredrik.hederstierna@securitas-direct.com

--- Comment #3 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Checked the size of the text segment on arm-none-eabi from 4.6 to 7.1, but no
major difference was seen, though there is some increase in later releases.

I previously saw code growth especially on ARM Thumb-1 code, but it seems to be
on track again with newer releases, at least when running the CSiBE benchmark.

gcc-4.6.4     1868 bytes (0)
gcc-4.7.4     1844 bytes (-1.3%)
gcc-4.8.5     1832 bytes (-1.9%)
gcc-4.9.3     1824 bytes (-2.4%)
gcc-5.3.0     1832 bytes (-1.9%)
gcc-6.3.0     1856 bytes (-0.6%)
gcc-7.1.0     1856 bytes (-0.6%)
gcc-8-master  1872 bytes (+0.2%)

arm-none-eabi-gcc -c -Os -std=gnu89 -mcpu=cortex-m3 -mthumb snake.c

See my CSiBE benchmark data at http://gcc.hederstierna.com/csibe/
It is currently only for ARM; my plan was to add more targets over time, but
the project is unfortunately on hold for now due to lack of time.

[Bug c/81524] Bogus or missing warnings when dereferencing pointer to deallocated stack memory

2017-07-25 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81524

--- Comment #3 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Isn't AddressSanitizer checking at run-time? There are several tools that can
find these bugs at run-time, I think, like Valgrind, but I need to find this at
compile-time.

I use an embedded arm-eabi target and cannot afford the memory for
instrumentation when compiling with AddressSanitizer. Or can AddressSanitizer
also do its job at compile-time, without adding code size for run-time
instrumentation?

[Bug c++/60517] warning/error for taking address of member of a temporary object

2017-07-25 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60517

--- Comment #23 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Ah ok, yes I think you are right. The check could possibly be in "cp/typeck.c"
and "cp/tree.c"? but I'm not familiar with this C++ parsing code.

Interesting that this code gets warning:

  B* foo(A a) {
    B *b = &(a.getB());
    return b;
  }

test.c: In function ‘B* foo2(A)’:
test.c:28:20: error: taking address of temporary [-fpermissive]
   B *b = &(a.getB());

but this does not (as in your example):

  double foo(A a) {
    double *x = &(a.getB().x[0]);
    return x[0];
  }

even though the address of the temporary was taken implicitly when taking the
address of the array x?

Checking the flow in "typeck.c", it does not give a warning since the 'kind' of
lvalue is not clk_class but clk_ordinary in the latter case, since it is an
ARRAY_REF. The warning code in typeck.c does check for this, but I saw that
recently some code was added to consider an array to be a 'class' in the
PR 80415 patch:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80415

Could it be possible to also include more 'kinds' or types in the warning-test
to also include this example?

Checking with this small patch does trigger the warning. It is probably not
the right thing to do, but maybe it is a starting point for finding a solution?

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c  
index 2122450..4dab925 100644   
--- a/gcc/cp/tree.c 
+++ b/gcc/cp/tree.c 
@@ -163,7 +163,6 @@ lvalue_kind (const_tree ref)
   /* FALLTHRU */
 case INDIRECT_REF:
 case ARROW_EXPR:
-case ARRAY_REF:
 case ARRAY_NOTATION_REF:
 case PARM_DECL:
 case RESULT_DECL:
@@ -212,6 +211,7 @@ lvalue_kind (const_tree ref)
 case COMPOUND_EXPR:
   return lvalue_kind (TREE_OPERAND (ref, 1));

+case ARRAY_REF:
 case TARGET_EXPR:
   return clk_class;

This will return the lvalue 'kind' as clk_class also for arrays, and then the
warning will trigger in typeck.c. Probably this is just nonsense, so I will
leave this to be solved by someone else who understands this part of the code
(not me).

Still, I think the other path is also interesting: the warning for
uninitialized artificials (which was removed in PR 43347 for unclear reasons.
I think the warning was correct, and the issue was rather that the generated
SRA name was not the real variable name, which is a completely different
problem related to debug info, I think)...

[Bug c++/60517] warning/error for taking address of member of a temporary object

2017-07-24 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60517

--- Comment #21 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Started with fix for PR 43347 to not warn on artificial aggregates

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43347

[Bug c++/60517] warning/error for taking address of member of a temporary object

2017-07-24 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60517

--- Comment #20 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Simplest fix might be something like?
- else
+ else if (access->grp_no_warning)
so we do not always suppress warnings, but name will look funny for temp.

[Bug c++/60517] warning/error for taking address of member of a temporary object

2017-07-24 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60517

Fredrik Hederstierna <fredrik.hederstie...@securitas-direct.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fredrik.hederstierna@securitas-direct.com

--- Comment #19 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
It is the SRA pass that sets TREE_NO_WARNING in function from "tree-sra.c":
"create_access_replacement (struct access *access)".

Since access->base is set to ARTIFICIAL and IGNORED, the else-case is taken,
setting:

  TREE_NO_WARNING (repl) = 1;

If removing this line we will get a warning in "tree-ssa-uninit.c":

test.c:120:14: warning: 'SR.1' is used uninitialized in this function
[-Wuninitialized]
   return x[0];
  ^
However, the generated name does not look good;
if SRA is forced to make a name, it ends up with this instead:

test.c:120:14: warning: '' is used uninitialized in this function
[-Wuninitialized]
   return x[0];

I'm not sure where to take it from here, suggestions?

[Bug c/81524] New: Bogus or missing warnings when dereferencing pointer to deallocated stack memory

2017-07-23 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81524

Bug ID: 81524
   Summary: Bogus or missing warnings when dereferencing pointer
to deallocated stack memory
   Product: gcc
   Version: 7.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fredrik.hederstie...@securitas-direct.com
  Target Milestone: ---

Created attachment 41814
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41814=edit
test_deref_ptr_to_dealloc_stack_mem.c

When dereferencing a pointer to deallocated stack memory, warning messages are
sometimes missing or give bogus information.

See attached test example with 6 different cases.

Some cases give a confusing message, I think, and in some the warning is missing.

Tested with GCC 7.1 and flags:
-Wnull-dereference -Wreturn-local-addr -Wuninitialized

Could it be possible to distinguish between a 'null' pointer and a 'dangling'
pointer? In points-to analysis it might be possible to see from the flow
whether a pointer will point to deallocated stack frame memory and mark it as
'dangling'. Now it seems that if the referenced object is missing, NULL is
assumed and a null-pointer warning is given in some cases, which might be bogus?

[Bug c/50584] No warning for passing small array to C99 static array declarator

2017-05-16 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50584

Fredrik Hederstierna <fredrik.hederstie...@securitas-direct.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fredrik.hederstierna@securitas-direct.com

--- Comment #14 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Still present in GCC-7.1.0

Simple test code:

-
#include <stdio.h>

int s[1] = { 1 };

void test(int v[static 2])
{
  printf("%d\n", v[1]);
  v[1] = 0;
}

int main(void)
{
  test(s);
  return 0;
}

-
No warnings by GCC:

> gcc -c test.c -W -Wall -Wextra -Warray-bounds -O2 -std=c99

-
But with CLANG-3.8.0 we get warnings:

> clang -c test.c

test.c:13:3: warning: array argument is too small; contains 1 elements, callee
  requires at least 2 [-Warray-bounds]
  test(s);
  ^~
test.c:5:15: note: callee declares array parameter as static here
void test(int v[static 2])
  ^~~
1 warning generated.
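
For contrast, here is a minimal sketch of a call that satisfies the C99
`static` array declarator, so neither compiler has anything to warn about.
The names sum_first_two, s_ok and check_static_declarator are hypothetical
and not part of the test code above.

```c
#include <assert.h>

/* `static 2` tells the compiler that every caller passes a pointer to
   at least 2 valid elements. */
int sum_first_two(const int v[static 2])
{
    return v[0] + v[1];
}

/* Large enough for the declarator above, unlike s[1] in the test code. */
int s_ok[2] = { 1, 2 };

/* Self-check: both elements are accessible and summed. */
void check_static_declarator(void)
{
    assert(sum_first_two(s_ok) == 3);
}
```

Passing the one-element array s from the test code to such a function is
exactly the case where a diagnostic would be useful, since the callee is
entitled to read v[1].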

[Bug tree-optimization/67213] When compiling for size with -Os loops can get bigger after peeling

2017-05-06 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67213

--- Comment #6 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Same thing for x86, not only ARM:


bash# gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609


bash# gcc -c test.c -Os --param max-completely-peel-times=5
bash# objdump -dath test.o


Disassembly of section .text:

000f :
   f:   c6 05 00 00 00 00 00    movb   $0x0,0x0(%rip)        # 16 <test_iter_6+0x7>
  16:   c6 05 00 00 00 00 01    movb   $0x1,0x0(%rip)        # 1d <test_iter_6+0xe>
  1d:   c6 05 00 00 00 00 02    movb   $0x2,0x0(%rip)        # 24 <test_iter_6+0x15>
  24:   c6 05 00 00 00 00 03    movb   $0x3,0x0(%rip)        # 2b <test_iter_6+0x1c>
  2b:   c6 05 00 00 00 00 04    movb   $0x4,0x0(%rip)        # 32 <test_iter_6+0x23>
  32:   c6 05 00 00 00 00 05    movb   $0x5,0x0(%rip)        # 39 <test_iter_6+0x2a>
  39:   c3                      retq


bash# gcc -c test.c -Os --param max-completely-peel-times=4
bash# objdump -dath test.o


Disassembly of section .text:

000f :
   f:   31 c0                   xor    %eax,%eax
  11:   88 80 00 00 00 00       mov    %al,0x0(%rax)
  17:   48 ff c0                inc    %rax
  1a:   48 83 f8 06             cmp    $0x6,%rax
  1e:   75 f1                   jne    11 <test_iter_6+0x2>
  20:   c3                      retq

[Bug tree-optimization/67213] When compiling for size with -Os loops can get bigger after peeling

2017-05-06 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67213

--- Comment #5 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Still same in GCC-7.1.0.

I analyzed using -fdump-tree-cunroll-details

void test_iter_6(void)
{
  int i;
  for (i = 0; i < 6; i++) {
    data[i] = i;
  }
}

The code generated for "test_iter_6" was:

001c :
  1c:   e59f3030    ldr     r3, [pc, #48]   ; 54 <test_iter_6+0x38>
  20:   e3a02000    mov     r2, #0
  24:   e5c32000    strb    r2, [r3]
  28:   e3a02001    mov     r2, #1
  2c:   e5c32001    strb    r2, [r3, #1]
  30:   e3a02002    mov     r2, #2
  34:   e5c32002    strb    r2, [r3, #2]
  38:   e3a02003    mov     r2, #3
  3c:   e5c32003    strb    r2, [r3, #3]
  40:   e3a02004    mov     r2, #4
  44:   e5c32004    strb    r2, [r3, #4]
  48:   e3a02005    mov     r2, #5
  4c:   e5c32005    strb    r2, [r3, #5]
  50:   e12fff1e    bx      lr
  54:   00000000    .word   0x00000000

With "--param max-completely-peel-times=4" (instead of default 5) it became

001c :
  1c:   e59f2014    ldr     r2, [pc, #20]   ; 38 <test_iter_6+0x1c>
  20:   e3a03000    mov     r3, #0
  24:   e7c33002    strb    r3, [r3, r2]
  28:   e2833001    add     r3, r3, #1
  2c:   e3530006    cmp     r3, #6
  30:   1afffffb    bne     24 <test_iter_6+0x8>
  34:   e12fff1e    bx      lr
  38:   00000000    .word   0x00000000

It seems like "try_unroll_loop_completely()" in file "tree-ssa-loop-ivcanon.c"
thinks it can fold the counting variable, but maybe that is not possible since
it is used both as index and as RHS value?

;; Function test_iter_6 (test_iter_6, funcdef_no=1, decl_uid=4067,
cgraph_uid=1)

Analyzing # of iterations of loop 1
  exit condition [5, + , 4294967295] != 0
  bounds on difference of bases: -5 ... -5
  result:
# of iterations 5, bounded by 5
Analyzing # of iterations of loop 1
  exit condition [5, + , 4294967295] != 0
  bounds on difference of bases: -5 ... -5
  result:
# of iterations 5, bounded by 5
Statement (exit)if (ivtmp_7 != 0)
 is executed at most 5 (bounded by 5) + 1 times in loop 1.
Induction variable (int) 0 + 1 * iteration does not wrap in statement data[i_9]
= _4;
 in loop 1.
Statement data[i_9] = _4;
 is executed at most 9 (bounded by 9) + 1 times in loop 1.
Induction variable (int) 1 + 1 * iteration does not wrap in statement i_6 = i_9
+ 1;
 in loop 1.
Statement i_6 = i_9 + 1;
 is executed at most 2147483646 (bounded by 2147483646) + 1 times in loop 1.
Loop 1 iterates 5 times.
Loop 1 iterates at most 5 times.
Estimating sizes for loop 1
 BB: 3, after_exit: 0
  size:   0 _4 = (char) i_9;
   Induction variable computation will be folded away.
  size:   1 data[i_9] = _4;
  size:   1 i_6 = i_9 + 1;
   Induction variable computation will be folded away.
  size:   1 ivtmp_7 = ivtmp_1 - 1;
   Induction variable computation will be folded away.
  size:   2 if (ivtmp_7 != 0)
   Exit condition will be eliminated in peeled copies.
 BB: 4, after_exit: 1
size: 5-4, last_iteration: 5-2
  Loop size: 5
  Estimated size after unrolling: 5

However, the code produced with peeling becomes

test_iter_6 ()
{
  int i;
  char _4;
  unsigned int ivtmp_7;
  char _12;
  unsigned int ivtmp_15;
  char _19;
  unsigned int ivtmp_22;
  char _26;
  unsigned int ivtmp_29;
  char _33;
  unsigned int ivtmp_36;
  char _40;
  unsigned int ivtmp_43;

  :
  _12 = 0;
  data[0] = _12;
  i_14 = 1;
  ivtmp_15 = 5;
  _19 = (char) i_14;
  data[i_14] = _19;
  i_21 = i_14 + 1;
  ivtmp_22 = ivtmp_15 + 4294967295;
  _26 = (char) i_21;
  data[i_21] = _26;
  i_28 = i_21 + 1;
  ivtmp_29 = ivtmp_22 + 4294967295;
  _33 = (char) i_28;
  data[i_28] = _33;
  i_35 = i_28 + 1;
  ivtmp_36 = ivtmp_29 + 4294967295;
  _40 = (char) i_35;
  data[i_35] = _40;
  i_42 = i_35 + 1;
  ivtmp_43 = ivtmp_36 + 4294967295;
  _4 = (char) i_42;
  data[i_42] = _4;
  i_6 = i_42 + 1;
  ivtmp_7 = ivtmp_43 + 4294967295;
  return;

}

instead of the original, shorter

test_iter_6 ()
{
  int i;
  unsigned int ivtmp_1;
  char _4;
  unsigned int ivtmp_7;

  :

  :
  # i_9 = PHI <i_6(4), 0(2)>
  # ivtmp_1 = PHI <ivtmp_7(4), 6(2)>
  _4 = (char) i_9;
  data[i_9] = _4;
  i_6 = i_9 + 1;
  ivtmp_7 = ivtmp_1 - 1;
  if (ivtmp_7 != 0)
goto ;
  else
goto ;

  :
  goto ;

  :
  return;

}

Could it be that, since the index is also used as the stored data, the variable
cannot be folded away as the size estimate assumes here?
  size:   1 i_6 = i_9 + 1;
   Induction variable computation will be folded away.

[Bug target/70341] [5/6/7 Regression] cost model for addresses is incorrect, slsr is using reg + reg + CST for arm

2016-09-05 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70341

--- Comment #10 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Could this be related to missing code hoisting in e.g. the combiner or gcse?
I found the old PR 11832 on a similar issue; could it be related, or share a
common solution?

Bug 11832 - Optimization of common stores in switch statements
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11832
BR/Fredrik

[Bug target/70341] [5/6/7 Regression] cost model for addresses is incorrect, slsr is using reg + reg + CST for arm

2016-09-04 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70341

--- Comment #8 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Could it be something in tree-ssa-forwprop pass?

This pass is executed 4 times at -Os, and starting with GCC-4.9 it seems that
it can propagate statements that generate instructions which are hard to
eliminate later in the compilation?


Checking 4.8.5 (good result)

;; Function test (test, funcdef_no=0, decl_uid=4075, cgraph_uid=0)

test (struct table_s * table, int xi)
{
  struct item_s * item;
  int _6;
  int _7;
  int _9;
  int _11;
  int _13;

  :
  item_4 = _2(D)->items[xi_3(D)];
  _6 = item_4->type;
  switch (_6) , case 1: , case 2: , case 3: , case 4:
>

:
  _7 = item_4->name;
  handle_case_1 (_7);
  goto  ();

:
  _9 = item_4->name;
  handle_case_2 (_9);
  goto  ();

:
  _11 = item_4->name;
  handle_case_3 (_11);
  goto  ();

:
  _13 = item_4->name;
  handle_case_4 (_13);

:
  return;

}

---

Same code GCC 4.9.3 (bad result)

;; Function test (test, funcdef_no=0, decl_uid=4090, symbol_order=0)

test (struct table_s * table, int xi)
{
  struct item_s * item;
  int _6;
  int _7;
  int _9;
  int _11;
  int _13;

  :
  _6 = MEM[(struct item_s *)table_2(D)].items[xi_3(D)].type;
  switch (_6) , case 1: , case 2: , case 3: , case 4:
>

:
  _7 = MEM[(struct item_s *)table_2(D)].items[xi_3(D)].name;
  handle_case_1 (_7);
  goto  ();

:
  _9 = MEM[(struct item_s *)table_2(D)].items[xi_3(D)].name;
  handle_case_2 (_9);
  goto  ();

:
  _11 = MEM[(struct item_s *)table_2(D)].items[xi_3(D)].name;
  handle_case_3 (_11);
  goto  ();

:
  _13 = MEM[(struct item_s *)table_2(D)].items[xi_3(D)].name;
  handle_case_4 (_13);

:
  return;

}


When comparing the two versions of "tree-ssa-forwprop.c" it seems that some
checks for invariant code have been removed, e.g. in
forward_propagate_addr_expr_1() there are no checks for
is_gimple_min_invariant(). I'm not familiar with this code, so it might be just
fine; it was only an observation, since it seems that invariant code is
propagated here, but maybe I'm wrong.

When compiling for speed there is less downside in copying statements; it's
like loop unrolling or peeling. But when optimizing for size I think it is
always bad if later passes cannot eliminate these extra copied instructions
again. Maybe propagation should be more restrictive when optimizing for size?
I did not see any checks for the optimization type in forward propagation, but
maybe it is intended to only do zero-cost propagations?
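
As a source-level illustration of the two dump shapes above (this is not
compiler output, and a compiler is free to generate identical code for both),
the difference can be sketched in C. The stub handler and the 4-element table
are assumptions made so the sketch is self-contained; the original test case
uses items[1] and four external handlers.

```c
#include <assert.h>

struct item_s { int index; int type; int name; int data; };
struct table_s { struct item_s items[4]; };   /* 4 entries: assumption */

/* Stub handler that records its arguments so both variants can be compared. */
static int last_case, last_name;
static void handle_case (int which, int name)
{
  last_case = which;
  last_name = name;
}

/* Shape of the 4.8.5 dump: the element address is formed once and reused.  */
static void test_hoisted (struct table_s *table, int xi)
{
  struct item_s *item = &table->items[xi];
  switch (item->type)
    {
    case 1: handle_case (1, item->name); break;
    case 2: handle_case (2, item->name); break;
    case 3: handle_case (3, item->name); break;
    case 4: handle_case (4, item->name); break;
    }
}

/* Shape of the 4.9.3 dump: every access spells out table->items[xi] again,
   so each case repeats the address computation unless a later pass
   commons it.  */
static void test_propagated (struct table_s *table, int xi)
{
  switch (table->items[xi].type)
    {
    case 1: handle_case (1, table->items[xi].name); break;
    case 2: handle_case (2, table->items[xi].name); break;
    case 3: handle_case (3, table->items[xi].name); break;
    case 4: handle_case (4, table->items[xi].name); break;
    }
}
```

Both functions behave the same; the question raised here is only whether the
second shape leaves duplicated address arithmetic in the final code.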

Thanks & BR/Fredrik

[Bug target/70359] [6/7 Regression] Code size increase for ARM compared to gcc-5.3.0

2016-08-28 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

--- Comment #19 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Tested GCC-6.2, still same behavior.

[Bug target/70341] [5/6/7 Regression] cost model for addresses is incorrect, slsr is using reg + reg + CST for arm

2016-08-28 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70341

--- Comment #7 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Tested with GCC 6.2 and still same behaviour.

[Bug target/70359] [6 Regression] Code size increase for ARM compared to gcc-5.3.0

2016-04-04 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

--- Comment #15 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Created attachment 38185
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38185&action=edit
tok.c

I took another example from CSiBE and stripped it down. I'm not 100% sure it is
the exact same issue, but it looks similar, so I attached it.

This gives bigger code for -Os -mcpu=arm966e-s -marm, and also for
-mcpu=cortex-m3, though arm7tdmi and cortex-m0 resulted in small code.

gcc 5.3.0:

 :
   0:   e92d4010push{r4, lr}
   4:   e1a04000mov r4, r0
   8:   ebfebl  0 
   c:   e350cmp r0, #0
  10:   0a03beq 24 <test+0x24>
  14:   e59f002cldr r0, [pc, #44]   ; 48 <test+0x48>
  18:   ebfebl  0 
  1c:   e1a4mov r0, r4
  20:   eaf8b   8 <test+0x8>
  24:   e1a4mov r0, r4
  28:   ebfebl  0 
  2c:   e350cmp r0, #0
  30:   0a02beq 40 <test+0x40>
  34:   e59f000cldr r0, [pc, #12]   ; 48 <test+0x48>
  38:   ebfebl  0 
  3c:   eaf8b   24 <test+0x24>
  40:   e1a4mov r0, r4
  44:   e8bd8010pop {r4, pc}
  48:   andeq   r0, r0, r0

master:

 :
   0:   e92d4070push{r4, r5, r6, lr}
   4:   e1a04000mov r4, r0
   8:   ebfebl  0 
   c:   e59f5048ldr r5, [pc, #72]   ; 5c <test+0x5c>
  10:   e350cmp r0, #0
  14:   1a06bne 34 <test+0x34>
  18:   e1a4mov r0, r4
  1c:   ebfebl  0 
  20:   e59f5034ldr r5, [pc, #52]   ; 5c <test+0x5c>
  24:   e350cmp r0, #0
  28:   1a06bne 48 <test+0x48>
  2c:   e1a4mov r0, r4
  30:   e8bd8070pop {r4, r5, r6, pc}
  34:   e1a5mov r0, r5
  38:   ebfebl  0 
  3c:   e1a4mov r0, r4
  40:   ebfebl  0 
  44:   eaf1b   10 <test+0x10>
  48:   e1a5mov r0, r5
  4c:   ebfebl  0 
  50:   e1a4mov r0, r4
  54:   ebfebl  0 
  58:   eaf1b   24 <test+0x24>
  5c:   andeq   r0, r0, r0

[Bug target/70359] [6 Regression] Code size increase for ARM compared to gcc-5.3.0

2016-03-24 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

--- Comment #6 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Thanks for your analysis on this. One comment on this 'complaint': it's not
only about size. In my example the compiler pushes and pops 2 more registers
and uses several more instructions, so I think it causes a performance
regression as well? I can file such 'complaints' as performance degradations
next time if that is better.

Actually this was derived from a larger code base analysis (CSiBE)

Bug 61578 - [4.9 regression] Code size increase for ARM thumb compared to 4.8.x
when compiling with -Os

CSiBE problems were analysed there, but that issue unfortunately got too fuzzy,
since it is hard to define an issue on almost 1000 files. The conclusion was
to create separate smaller examples to work on, because the overall CSiBE
benchmark was hard to get an overview of.

I understand there is much focus on speed in GCC development, but I do
think that size really is important as well, since GCC is also used very widely
for typically ARM-based embedded systems.

I will continue to try to track size on CSiBE at
http://gcc.hederstierna.com/csibe, but please comment if you think size
regression reports are unwanted; I can try to focus more on issues that have
both speed and size components combined (I guess they often go hand in
hand).

Thanks, BR, Fredrik

[Bug target/70359] New: Code size increase for ARM compared to gcc-5.3.0

2016-03-22 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

Bug ID: 70359
   Summary: Code size increase for ARM compared to gcc-5.3.0
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fredrik.hederstie...@securitas-direct.com
  Target Milestone: ---

Created attachment 38058
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38058&action=edit
inttostr.c

Code size increase on master for ARM target compared to gcc-5.3.0
Target: arm-none-eabi
Flags: -Os -mcpu=arm966e-s -marm

gcc 5.3.0:

 :
   0:   e3a03000mov r3, #0
   4:   e92d4070push{r4, r5, r6, lr}
   8:   e1a06000mov r6, r0
   c:   e2422001sub r2, r2, #1
  10:   e0205fc0eor r5, r0, r0, asr #31
  14:   e0455fc0sub r5, r5, r0, asr #31
  18:   e0814002add r4, r1, r2
  1c:   e7c13002strbr3, [r1, r2]
  20:   e1a5mov r0, r5
  24:   e3a0100amov r1, #10
  28:   ebfebl  0 <__aeabi_uidivmod>
  2c:   e2811030add r1, r1, #48 ; 0x30
  30:   e5641001strbr1, [r4, #-1]!
  34:   e1a5mov r0, r5
  38:   e3a0100amov r1, #10
  3c:   ebfebl  0 <__aeabi_uidiv>
  40:   e2505000subsr5, r0, #0
  44:   1af5bne 20 <inttostr+0x20>
  48:   e356cmp r6, #0
  4c:   b3a0302dmovlt   r3, #45 ; 0x2d
  50:   b5443001strblt  r3, [r4, #-1]
  54:   b2444001sublt   r4, r4, #1
  58:   e1a4mov r0, r4
  5c:   e8bd8070pop {r4, r5, r6, pc}


gcc-6-20160313 snapshot from master:

 :
   0:   e3a03000mov r3, #0
   4:   e92d41f0push{r4, r5, r6, r7, r8, lr}
   8:   e1a07000mov r7, r0
   c:   e3a0800amov r8, #10
  10:   e2422001sub r2, r2, #1
  14:   e0206fc0eor r6, r0, r0, asr #31
  18:   e0466fc0sub r6, r6, r0, asr #31
  1c:   e0815002add r5, r1, r2
  20:   e7c13002strbr3, [r1, r2]
  24:   e1a6mov r0, r6
  28:   e1a01008mov r1, r8
  2c:   ebfebl  0 <__aeabi_uidivmod>
  30:   e2811030add r1, r1, #48 ; 0x30
  34:   e5451001strbr1, [r5, #-1]
  38:   e1a6mov r0, r6
  3c:   e1a01008mov r1, r8
  40:   ebfebl  0 <__aeabi_uidiv>
  44:   e2506000subsr6, r0, #0
  48:   e2454001sub r4, r5, #1
  4c:   1a05bne 68 <inttostr+0x68>
  50:   e357cmp r7, #0
  54:   b3a0302dmovlt   r3, #45 ; 0x2d
  58:   b5443001strblt  r3, [r4, #-1]
  5c:   b2454002sublt   r4, r5, #2
  60:   e1a4mov r0, r4
  64:   e8bd81f0pop {r4, r5, r6, r7, r8, pc}
  68:   e1a05004mov r5, r4
  6c:   eaecb   24 <inttostr+0x24>

[Bug c/70341] Code size increase on ARM cortex-m3 for switch statements

2016-03-21 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70341

--- Comment #1 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Created attachment 38051
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38051&action=edit
switch.c

Added an additional, slightly larger test case that also shows problems for
cortex-m0 and thumb1.

[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2016-03-21 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #41 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Ok, I will mark this as resolved if no one objects.

I tried the ip-fixed patch again on master, but the gain was very small, so I
do not think this is a way forward currently.

I created this new separate issue from the example submitted in this issue:

Bug 70341 - Code size increase on ARM cortex-m3 for switch statements

Thanks/Fredrik

[Bug c/70341] New: Code size increase on ARM cortex-m3 for switch statements

2016-03-21 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70341

Bug ID: 70341
   Summary: Code size increase on ARM cortex-m3 for switch
statements
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fredrik.hederstie...@securitas-direct.com
  Target Milestone: ---

Created attachment 38049
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38049&action=edit
test_switch.c

Starting with GCC-4.9.x suboptimal code is generated for this switch-statement.
Toolchain arm-none-eabi.

extern void handle_case_1(int name);
extern void handle_case_2(int name);
extern void handle_case_3(int name);
extern void handle_case_4(int name);

struct item_s
{
  int index;
  int type;
  int name;
  int data;
};

struct table_s
{
  struct item_s items[1];
};

void test(struct table_s *table, int xi)
{
  struct item_s *item = &(table->items[xi]);
  switch (item->type)
    {
    case 1:
      handle_case_1(item->name);
      break;
    case 2:
      handle_case_2(item->name);
      break;
    case 3:
      handle_case_3(item->name);
      break;
    case 4:
      handle_case_4(item->name);
      break;
    }
}

Compiled with gcc-4.6.x, gcc-4.7.x, gcc-4.8.x:

 :
   0:   eb00 1101   add.w   r1, r0, r1, lsl #4
   4:   684bldr r3, [r1, #4]
   6:   3b01subsr3, #1
   8:   2b03cmp r3, #3
   a:   d80fbhi.n   2c <test+0x2c>
   c:   e8df f003   tbb [pc, r3]
  10:   0b080502bleq201420 <test+0x201420>
  14:   6888ldr r0, [r1, #8]
  16:   f7ff bffe   b.w 0 
  1a:   6888ldr r0, [r1, #8]
  1c:   f7ff bffe   b.w 0 
  20:   6888ldr r0, [r1, #8]
  22:   f7ff bffe   b.w 0 
  26:   6888ldr r0, [r1, #8]
  28:   f7ff bffe   b.w 0 
  2c:   4770bx  lr

Compiled with 4.9.x, 5.3.0, and with current master:

 :
   0:   0109lslsr1, r1, #4
   2:   1843addsr3, r0, r1
   4:   685bldr r3, [r3, #4]
   6:   3b01subsr3, #1
   8:   2b03cmp r3, #3
   a:   d813bhi.n   34 <test+0x34>
   c:   e8df f003   tbb [pc, r3]
  10:   0e0a0602cfmadd32eq  mvax0, mvfx0, mvfx10, mvfx2
  14:   4408add r0, r1
  16:   6880ldr r0, [r0, #8]
  18:   f7ff bffe   b.w 0 
  1c:   4408add r0, r1
  1e:   6880ldr r0, [r0, #8]
  20:   f7ff bffe   b.w 0 
  24:   4408add r0, r1
  26:   6880ldr r0, [r0, #8]
  28:   f7ff bffe   b.w 0 
  2c:   4408add r0, r1
  2e:   6880ldr r0, [r0, #8]
  30:   f7ff bffe   b.w 0 
  34:   4770bx  lr

Flags: -mcpu=cortex-m3
Both -Os and -O2 give increased code size.

[Bug tree-optimization/67213] When compiling for size with -Os loops can get bigger after peeling

2016-03-21 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67213

--- Comment #4 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
I've investigated this issue further, and I believe the problem might be
that we allow too many iterations when doing complete peeling of loops on ARM.

The heuristic in "tree-ssa-loop-ivcanon.c" for estimating the unrolled
cost/size in "estimated_unrolled_size()" is quite rough; it just assumes the
code will be reduced to 2/3 by later passes? This is not always true, and I
think it can lead to larger code size after complete peeling of loops (as in
the example in this issue).

It seems very difficult to estimate the final size after complete peeling,
especially across all architectures. I've experimented with 3/4 when optimizing
for size, but it became worse.
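
To make the 2/3 heuristic concrete, here is a minimal sketch of the arithmetic
as I understand it, fed with the counters from the cunroll dump quoted
elsewhere in this thread ("size: 5-4, last_iteration: 5-2", 5 iterations).
The struct and function names are assumptions for illustration, not the exact
GCC source.

```c
#include <assert.h>

/* Hypothetical mirror of the counters printed in the cunroll dump;
   the field names are assumptions.  */
struct loop_size_sketch
{
  long overall;                 /* statements in one iteration */
  long eliminated_by_peeling;   /* of those, expected to fold away */
  long last_iteration;          /* statements on the path to the exit */
  long last_iteration_eliminated_by_peeling;
};

/* Estimate the size after complete peeling: what survives in each of the
   nunroll copies, plus what survives in the final partial iteration,
   scaled by num/den as a guess at what later passes will remove.  */
static long
estimated_unrolled_size_sketch (const struct loop_size_sketch *s,
                                long nunroll, long num, long den)
{
  long insns = nunroll * (s->overall - s->eliminated_by_peeling);
  insns += s->last_iteration - s->last_iteration_eliminated_by_peeling;
  insns = insns * num / den;
  return insns <= 0 ? 1 : insns;
}
```

With the dump's numbers this gives (5 * 1 + 3) * 2 / 3 = 5, the same value the
dump prints as "Estimated size after unrolling", which is not larger than the
loop size of 5, so the loop gets peeled. A 3/4 factor would give 6 for this
loop.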

One solution that works for me is to set a lower limit on the number of
iterations that complete peeling may handle:

I made this patch and it worked.
(The same thing is done in "spu.c" for the SPU architecture when small code
size is wanted.)

In function "arm_option_override (void)":

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c868490..2ba8244 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
+
+  /* Small loops might be completely unpeeled even at -Os.
+ Try to keep code small.  */
+  if (optimize_function_for_size_p (cfun)
+      && !flag_unroll_loops && !flag_peel_loops)
+    maybe_set_param_value (PARAM_MAX_COMPLETELY_PEEL_TIMES, 4,
+                           global_options.x_param_values,
+                           global_options_set.x_param_values);


I simply override max-completely-peel-times to be 4 instead of the default 16,
and this seems to work well.

I tested it with the CSiBE benchmark on arm/thumb1/thumb2 and I got shorter
code on all tests, with no negative results on any function.

What do you think, is this an okay solution to this issue? The best long-term
solution would be to estimate the cost/size of unrolling better, but that
seems like a much more difficult problem to solve.

/Fredrik

[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2016-03-20 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #39 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Created attachment 38036
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38036&action=edit
CSiBE results for arm thumb1 and thumb2 code generation for various toolchains.


[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2016-03-20 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #38 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
I guess this 'meta bug report' has become slightly fuzzy with all kinds of
CSiBE code size increase issues, so maybe it is better to identify these issues
on a more detailed level and create smaller, specific reports?

I've made some attempts, like posting Bug 67507 and Bug 67213. I think the
source attached to this issue also has some switch-case issue and still
becomes larger, though I think it's better to post that in a separate issue
as well.

I did a new benchmark yesterday on code size for

arm9_thumb
arm9_arm
cortex-m0
cortex-m3

using newly built compiler toolchains for

gcc 4.6.4
gcc 4.7.4
gcc 4.8.5
gcc 4.9.3
gcc 5.3.0
gcc 6-20160313 snapshot

in total 4*6=24 builds. See attached Excel file for results.
Also you can check out my pre-alpha site at

http://gcc.hederstierna.com/csibe/

(my hope is to be able to build an up-to-date arm-size-benchmark, but it's very
pre-alpha still.)

The results look good I think. The overall size is getting smaller, so I think
it's ok.
Some files still compile into large code, though; I think we need to dig into
these and look at the specific files.

There is also a proposed patch attached to this issue regarding the arm
register IP, which I think could be used to further analyze why code size
with LRA might increase in specific cases.

If you still think the ip-patch is valid, I can rebase it against git master
and submit the patch again in a separate issue?

Best Regards, Fredrik

[Bug target/67366] Poor assembly generation for unaligned memory accesses on ARM v6 & v7 cpus

2015-10-11 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67366

Fredrik Hederstierna <fredrik.hederstie...@securitas-direct.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fredrik.hederstierna@securi
                   |                            |tas-direct.com

--- Comment #16 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Could this fix also possibly improve:
Bug 67507 - "Code size increase with -Os from GCC 4.8.x to GCC 4.9.x for ARM
thumb1", which also seems related to alignment on ARM targets?
Thanks, Fredrik


[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2015-09-08 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #25 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
I've put this last example in a separate issue:
Bug 67507 - Code size increase with -Os from GCC 4.8.x to GCC 4.9.x for ARM
thumb1.
I've also previously filed this one, which also causes a size increase:
Bug 67213 - When compiling for size with -Os loops can get bigger after
peeling.
Best Regards, Fredrik


[Bug target/67507] New: Code size increase with -Os from GCC 4.8.x to GCC 4.9.x for ARM thumb1

2015-09-08 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67507

Bug ID: 67507
   Summary: Code size increase with -Os from GCC 4.8.x to GCC
4.9.x for ARM thumb1
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fredrik.hederstie...@securitas-direct.com
  Target Milestone: ---

Created attachment 36308
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36308&action=edit
Example code

Starting with GCC 4.9.x the code size increases with arm-eabi thumb for the
attached example code. It seems related to alignment and is still present in
GCC 5.2.0.

Example C code (see the attachment for more examples):
The code can cause an alignment data abort, but it should still compile
consistently and fine assuming the user gives aligned data. Example 5 gives a
type-punned warning for all compilations; none of the other examples gives
warnings.

extern void func(int data);
char global_data_unaligned[4];
void test_unaligned_1(void) {
  int *idata = (int*)global_data_unaligned;
  func(*idata);
}

Compiled with GCC 4.8.5 arm-none-eabi cortex-m0 -Os:

 :
   0:   b508push{r3, lr}
   2:   4b07ldr r3, [pc, #28]   ; (20 <test_unaligned_1+0x20>)
   4:   7858ldrbr0, [r3, #1]
   6:   781aldrbr2, [r3, #0]
   8:   0200lslsr0, r0, #8
   a:   4310orrsr0, r2
   c:   789aldrbr2, [r3, #2]
   e:   78dbldrbr3, [r3, #3]
  10:   0412lslsr2, r2, #16
  12:   4310orrsr0, r2
  14:   061blslsr3, r3, #24
  16:   4318orrsr0, r3
  18:   f7ff fffe   bl  0 
  1c:   bd08pop {r3, pc}
  1e:   46c0nop ; (mov r8, r8)
  20:   .word   0x

Compiled with GCC 5.2.0 arm-none-eabi cortex-m0 -Os, +4 bytes:

 :
   0:   b510push{r4, lr}
   2:   4c08ldr r4, [pc, #32]   ; (24 <test_unaligned_1+0x24>)
   4:   7863ldrbr3, [r4, #1]
   6:   7821ldrbr1, [r4, #0]
   8:   78a0ldrbr0, [r4, #2]
   a:   021blslsr3, r3, #8
   c:   430borrsr3, r1
   e:   0400lslsr0, r0, #16
  10:   001amovsr2, r3  // ???
  12:   0003movsr3, r0  // ???
  14:   78e0ldrbr0, [r4, #3]
  16:   4313orrsr3, r2
  18:   0600lslsr0, r0, #24
  1a:   4318orrsr0, r3
  1c:   f7ff fffe   bl  0 
  20:   bd10pop {r4, pc}
  22:   46c0nop ; (mov r8, r8)
  24:   .word   0x

With GCC 4.8.5 arm-none-eabi cortex-m0 -O2 the code gets shorter;
no alignment check when compiling for speed?

 :
   0:   b508push{r3, lr}
   2:   4b02ldr r3, [pc, #8]; (c <test_unaligned_1+0xc>)
   4:   6818ldr r0, [r3, #0]
   6:   f7ff fffe   bl  0 
   a:   bd08pop {r3, pc}
   c:   .word   0x

--

Example3 compiled with GCC 4.8.5 arm-none-eabi cortex-m0 -Os

0048 :
  48:   b508push{r3, lr}
  4a:   4b03ldr r3, [pc, #12]   ; (58 <test_unaligned_3+0x10>)
  4c:   2201movsr2, #1
  4e:   4393bicsr3, r2
  50:   6818ldr r0, [r3, #0]
  52:   f7ff fffe   bl  0 
  56:   bd08pop {r3, pc}
  58:   .word   0x

Same code compiled with GCC 5.2.0 arm-none-eabi cortex-m0 -Os

0028 :
  28:   2201movsr2, #1
  2a:   4b05ldr r3, [pc, #20]   ; (40 <test_unaligned_3+0x18>)
  2c:   b510push{r4, lr}
  2e:   4393bicsr3, r2
  30:   8858ldrhr0, [r3, #2]// ?? why ldrh
  32:   881aldrhr2, [r3, #0]// ?? why ldrh
  34:   0400lslsr0, r0, #16
  36:   4310orrsr0, r2
  38:   f7ff fffe   bl  0 
  3c:   bd10pop {r4, pc}
  3e:   46c0nop ; (mov r8, r8)
  40:   .word   0x

There seems to be some issue with assumptions on alignment, causing larger code
size. I checked the IRA dump for IRA coloring/build, and some examples seem to
be assigned more hard regs with the new threads code added in GCC 4.9, though I
haven't dug further into this yet.
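
For comparison, a well-defined way to express the same load in C is sketched
below. It is not taken from the attachment; the function names are made up.
The memcpy form leaves it to the compiler to pick byte loads or a single word
load, and the second function spells out the byte-wise little-endian assembly
that the ldrb/lsls/orrs sequences above perform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Well-defined load of a 32-bit value from a possibly unaligned buffer.
   The result follows the byte order of the machine running the code.  */
static uint32_t load_u32 (const unsigned char *p)
{
  uint32_t v;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Explicit little-endian assembly from four bytes, independent of the
   byte order of the machine running the code.  */
static uint32_t load_u32_le_bytes (const unsigned char *p)
{
  return (uint32_t) p[0]
         | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[3] << 24);
}
```

Neither form casts a char pointer to int *, so no alignment or aliasing
assumption is passed to the compiler.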

The toolchain was built with GNU Build Buddy for arm-none-eabi, softfloat; see
the scripts at https://github.com/fredrikhederstierna/buildbuddy

/Fredrik


[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2015-09-02 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #23 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Thanks for your patch, I tried it out, and it solves the small example fine,
the code now is similar to GCC 4.8 for this particular example.

However, when I ran the full CSiBE benchmark, the code size unfortunately grew
by approx. +150 bytes overall for the full suite. So the patch did not solve
the generic root problem with the code size increase.

This is strange and I'm thinking about how to continue from here. This issue
has diverged a bit too much (mostly my own fault) with several examples
etc. Do you think we should perhaps create separate issues for the different
small examples that compile badly? At the same time we need to keep track of
the 'generic' overall code size issue that e.g. CSiBE points out.

Here is another small example I tested yesterday that also gives unnecessary
moves; tested for arm7tdmi, arm966e-s and cortex-m0.

extern void func(int data);
char cdata[4];
void test(void) {
  int *idata = (int*)cdata;
  func(*idata);
}

Compiles with GCC 4.8.5 (cortex-m0):

 :
   0:   b508push{r3, lr}
   2:   4b07ldr r3, [pc, #28]   ; (20 <test+0x20>)
   4:   7858ldrbr0, [r3, #1]
   6:   781aldrbr2, [r3, #0]
   8:   0200lslsr0, r0, #8
   a:   4310orrsr0, r2
   c:   789aldrbr2, [r3, #2]
   e:   78dbldrbr3, [r3, #3]
  10:   0412lslsr2, r2, #16
  12:   4310orrsr0, r2
  14:   061blslsr3, r3, #24
  16:   4318orrsr0, r3
  18:   f7ff fffe   bl  0 
  1c:   bd08pop {r3, pc}
  1e:   46c0nop ; (mov r8, r8)
  20:   .word   0x

With GCC 6 master with latest LRA patch (+4 bytes):

 :
   0:   b510push{r4, lr}
   2:   4c08ldr r4, [pc, #32]   ; (24 <test+0x24>)
   4:   7863ldrbr3, [r4, #1]
   6:   7821ldrbr1, [r4, #0]
   8:   78a0ldrbr0, [r4, #2]
   a:   021blslsr3, r3, #8
   c:   430borrsr3, r1
   e:   0400lslsr0, r0, #16
  10:   001amovsr2, r3   ??? MOVE
  12:   0003movsr3, r0   ??? MOVE
  14:   78e0ldrbr0, [r4, #3]
  16:   4313orrsr3, r2
  18:   0600lslsr0, r0, #24
  1a:   4318orrsr0, r3
  1c:   f7ff fffe   bl  0 
  20:   bd10pop {r4, pc}
  22:   46c0nop ; (mov r8, r8)
  24:   .word   0x

Kind Regards, Fredrik


[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2015-09-01 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #21 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Great, thanks!

Just a note as you are looking into this:
I think neither GCC 4.8.x nor GCC 5.2.x produces the optimal code for this
case. Isn't it better to load the result register r0 directly, instead of
going via r3?

GCC 4.8.5 (good version):

isascii:
mov r2, #127
mov r3, #0
cmp r2, r0
adc r3, r3, r3
mov r0, r3
bx  lr

better ??:

isascii:
mov r2, #127
mov r3, #0
cmp r2, r0
+   adc r0, r3, r3   <- put result in R0 directly?
-   adc r3, r3, r3
-   mov r0, r3
bx  lr

This would save one more instruction if I'm thinking right.
BR, Fredrik


[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2015-08-19 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #19 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
I'm not sure why bug 59535 was closed, same problem might still exist, quote:

 Zhenqiang Chen 2014-09-03 06:17:44 UTC
 
 Here is a small case to show lra introduces one more register copy (tested 
 with trunk and 4.9).

int isascii (int c)
{
  return c >= 0 && c < 128;
}
With options: -Os -mthumb -mcpu=cortex-m0, I got

isascii:
mov r3, #0
mov r2, #127
mov r1, r3   //???
cmp r2, r0
adc r1, r1, r3
mov r0, r1
bx  lr

With options: -Os -mthumb -mcpu=cortex-m0 -mno-lra, I got

isascii:
mov r2, #127
mov r3, #0
cmp r2, r0
adc r3, r3, r3
mov r0, r3
bx  lr
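
The cmp/adc pair in these listings computes the range test as one unsigned
comparison against 127, with the carry flag as the result. A small C sketch of
that equivalence follows; the function names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* The range test written as two signed comparisons.  */
static int isascii_two_tests (int c)
{
  return c >= 0 && c < 128;
}

/* Single unsigned comparison: negative values convert to large unsigned
   numbers, so one test covers both bounds.  */
static int isascii_one_test (int c)
{
  return (unsigned int) c <= 127u;
}
```

Both functions return the same value for every int, which is why the whole
function can be done with one compare plus a carry-to-register step.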

Testing 4.8.5 and 5.2.0, GCC 5.2 still produces the same bigger code.
So something adds a register copy in this small case.
/Fredrik


[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2015-08-18 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #18 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Created attachment 36202
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36202&action=edit
Disasm for -mthumb also, code size increase was +48%.


[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2015-08-18 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #17 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Created attachment 36201
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36201&action=edit
Simple example giving +50% code size increase compared gcc-4.8.5 and gcc-5.2.0

Simple example giving +50% code size increase compared gcc-4.8.5 and gcc-5.2.0.

We still cannot use GCC 4.9, GCC 5.1 and GCC 5.2 due to code size increase.
The GCC 4.8.x produces the smallest binaries for us, and we still haven't
figured out exactly why.

Attached is an attempt to create a test case where GCC 4.9 and after creates
bad code. Please check it out.

With GCC-5.2.0 we got 55% code size increase compared to GCC-4.8.5.

In parallel I work with the CSiBE benchmark and I hope I can contribute with
some more metrics soon.

Thanks and Best Regards/Fredrik


[Bug tree-optimization/67213] When compiling for size with -Os loops can get bigger after peeling

2015-08-14 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67213

--- Comment #3 from Fredrik Hederstierna 
<fredrik.hederstie...@securitas-direct.com> ---
Created attachment 36186
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36186&action=edit
Dump from tree-ssa-loop-ivcanon.c

In the function iter_6 it seems to keep cost 5 when unrolling.
Maybe the weight and cost estimates could be more pessimistic when
optimizing for size? I think the functions tree_estimate_loop_size() and
estimated_unrolled_size() use a rough guess of 1/3; perhaps it could be
more pessimistic, e.g. for -Os?
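
A toy model of the kind of estimate discussed above, for illustration only;
this is not GCC's code. It assumes the estimate multiplies a per-iteration
cost by the iteration count and then keeps only a fraction of it, on the
expectation that later passes remove the rest. The function names, the
keep_num/keep_den parameters and the sample cost of 5 are assumptions.

```c
/* Toy model, NOT GCC's implementation: estimate the size of a fully
   unrolled loop as body_cost * iterations, then keep keep_num/keep_den
   of it on the assumption that the rest is optimized away later.  */
unsigned toy_unrolled_size(unsigned body_cost, unsigned iterations,
                           unsigned keep_num, unsigned keep_den)
{
    unsigned size = body_cost * iterations;
    size = size * keep_num / keep_den;
    return size ? size : 1;   /* never report a size of zero */
}

/* Hypothetical decision rule for -Os: unroll only if the estimate
   does not exceed the size of the rolled loop.  */
int toy_should_unroll_for_size(unsigned rolled_size,
                               unsigned unrolled_estimate)
{
    return unrolled_estimate <= rolled_size;
}
```

With a cost of 5 and 6 iterations, keeping 2/3 gives an estimate of 20,
while a fully pessimistic 1/1 gives 30; a size-oriented rule would be less
likely to accept the second figure.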


[Bug tree-optimization/67213] New: When compiling for size with -Os loops can get bigger after peeling

2015-08-14 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67213

Bug ID: 67213
   Summary: When compiling for size with -Os loops can get bigger
after peeling
   Product: gcc
   Version: 5.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fredrik.hederstie...@securitas-direct.com
  Target Milestone: ---

When compiling thumb1 code for size with -Os some loops can be larger due to
complete peeling.

Example code:

extern char data[10];

void test_iter_2(void)
{
  int i;
  for (i = 0; i < 2; i++) {
    data[i] = i;
  }
}

void test_iter_6(void)
{
  int i;
  for (i = 0; i < 6; i++) {
    data[i] = i;
  }
}

void test_iter_7(void)
{
  int i;
  for (i = 0; i < 7; i++) {
    data[i] = i;
  }
}

It will compile to

00000000 <test_iter_2>:
   0:   e3a02000    mov     r2, #0
   4:   e59f300c    ldr     r3, [pc, #12]   ; 18 <test_iter_2+0x18>
   8:   e5c32000    strb    r2, [r3]
   c:   e3a02001    mov     r2, #1
  10:   e5c32001    strb    r2, [r3, #1]
  14:   e12fff1e    bx      lr
  18:   00000000    .word   0x00000000

0000001c <test_iter_6>:
  1c:   e3a02000    mov     r2, #0
  20:   e59f302c    ldr     r3, [pc, #44]   ; 54 <test_iter_6+0x38>
  24:   e5c32000    strb    r2, [r3]
  28:   e3a02001    mov     r2, #1
  2c:   e5c32001    strb    r2, [r3, #1]
  30:   e3a02002    mov     r2, #2
  34:   e5c32002    strb    r2, [r3, #2]
  38:   e3a02003    mov     r2, #3
  3c:   e5c32003    strb    r2, [r3, #3]
  40:   e3a02004    mov     r2, #4
  44:   e5c32004    strb    r2, [r3, #4]
  48:   e3a02005    mov     r2, #5
  4c:   e5c32005    strb    r2, [r3, #5]
  50:   e12fff1e    bx      lr
  54:   00000000    .word   0x00000000

00000058 <test_iter_7>:
  58:   e3a03000    mov     r3, #0
  5c:   e59f2010    ldr     r2, [pc, #16]   ; 74 <test_iter_7+0x1c>
  60:   e7c33002    strb    r3, [r3, r2]
  64:   e2833001    add     r3, r3, #1
  68:   e3530007    cmp     r3, #7
  6c:   1afffffb    bne     60 <test_iter_7+0x8>
  70:   e12fff1e    bx      lr
  74:   00000000    .word   0x00000000

The unrolling of iter_6 seems to be controlled by the default:

 --param max-completely-peel-times=5

When changing this to

 --param max-completely-peel-times=0

the code for iter_6 is fine, but then iter_2 gets larger.

00000000 <test_iter_2>:
   0:   e3a03000    mov     r3, #0
   4:   e59f2010    ldr     r2, [pc, #16]   ; 1c <test_iter_2+0x1c>
   8:   e7c33002    strb    r3, [r3, r2]
   c:   e2833001    add     r3, r3, #1
  10:   e3530002    cmp     r3, #2
  14:   1afffffb    bne     8 <test_iter_2+0x8>
  18:   e12fff1e    bx      lr
  1c:   00000000    .word   0x00000000

00000020 <test_iter_6>:
  20:   e3a03000    mov     r3, #0
  24:   e59f2010    ldr     r2, [pc, #16]   ; 3c <test_iter_6+0x1c>
  28:   e7c33002    strb    r3, [r3, r2]
  2c:   e2833001    add     r3, r3, #1
  30:   e3530006    cmp     r3, #6
  34:   1afffffb    bne     28 <test_iter_6+0x8>
  38:   e12fff1e    bx      lr
  3c:   00000000    .word   0x00000000

00000040 <test_iter_7>:
  40:   e3a03000    mov     r3, #0
  44:   e59f2010    ldr     r2, [pc, #16]   ; 5c <test_iter_7+0x1c>
  48:   e7c33002    strb    r3, [r3, r2]
  4c:   e2833001    add     r3, r3, #1
  50:   e3530007    cmp     r3, #7
  54:   1afffffb    bne     48 <test_iter_7+0x8>
  58:   e12fff1e    bx      lr
  5c:   00000000    .word   0x00000000

I guess it's a trade-off between the number of allowed unrolls and the expected
code size growth or decrease. Still, maybe the code size growth could be
detected in this case.
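
The break-even point can be read off the listings above. Counting the
instructions shown (4 bytes each in this ARM-mode disassembly, plus one
4-byte literal word), a fully peeled loop with n iterations costs one ldr,
n mov/strb pairs and one bx, while the rolled loop is always 7 instructions.
A small sketch of that arithmetic; the function names are made up for
illustration:

```c
/* Sizes in bytes, derived from the disassembly listings in this report. */
enum { INSN_BYTES = 4, LITERAL_BYTES = 4 };

/* Fully peeled: ldr + n * (mov + strb) + bx, plus the literal word. */
unsigned peeled_bytes(unsigned n)
{
    return (2u * n + 2u) * INSN_BYTES + LITERAL_BYTES;
}

/* Rolled: mov, ldr, strb, add, cmp, bne, bx, plus the literal word. */
unsigned rolled_bytes(void)
{
    return 7u * INSN_BYTES + LITERAL_BYTES;
}

/* Non-zero when peeling n iterations is no larger than the loop. */
int peeling_saves_space(unsigned n)
{
    return peeled_bytes(n) <= rolled_bytes();
}
```

With these numbers peeling only pays off up to n = 2 (28 bytes against 32),
while test_iter_6 grows from 32 to 60 bytes when peeled.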

I attach the toolchain build script and code.


[Bug tree-optimization/67213] When compiling for size with -Os loops can get bigger after peeling

2015-08-14 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67213

--- Comment #1 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
Created attachment 36185
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36185&action=edit
Example files


[Bug target/61578] [4.9/5 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2015-03-02 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #14 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
Created attachment 34916
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34916&action=edit
CSiBE benchmark with gnu89, updates with newer trunk as reference.

I added an attachment with a new CSiBE measurement from a newer trunk,
now using -std=gnu89 for correctness.

It looks a lot better on current trunk: the code size is now smaller than 4.8.x,
so in this case the issue seems at least partly resolved.

Still, the proposed patch with -mip-fixed on trunk gives approx. -0.2% reduced
code size on average, which might be significant. See attached docs.

The CSiBE test also indicates that LRA might have improved specific areas,
where the code size gets worse with IP fixed, which could be investigated
further. An example file is libmspack/test/cabd_md5.c.

So, I'm just wondering whether those of you who are or have been involved with
this issue think the proposed patch is a good idea and worth putting time into
to make it proper for commit? I just do not want to put time and effort into
this patch if it's not likely to get in, or if you think it's a bad idea.

Please comment :)  Thanks and Kind Regards Fredrik


[Bug target/61578] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2014-11-02 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #9 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
Created attachment 33866
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33866&action=edit
Simple patch to exclude use of ip

Simple patch that makes it possible to optionally exclude use of ip for thumb1
when optimizing for size.


[Bug target/61578] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2014-11-02 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #10 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
Created attachment 33867
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33867&action=edit
Size benchmark gcc 4.8, gcc 4.9 and trunk.

Added an updated CSiBE benchmark for GCC 4.8, GCC 4.9 and trunk.
It's obvious that excluding ip gives shorter code.
There is also something on trunk that makes some projects become very large,
which should perhaps be investigated.


[Bug target/61578] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2014-09-23 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #8 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
Created attachment 33544
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33544&action=edit
CSiBE test results size

Attached some tests with CSiBE v2.1.1 for arm-eabi.
It seems like the results are very scattered,
sometimes GCC 4.8.3 produces smaller code than GCC 4.9.1,
but on other files it seems to be vice versa.
/Fredrik


[Bug rtl-optimization/59535] [4.9 regression] -Os code size regressions for Thumb1/Thumb2 with LRA

2014-09-03 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

--- Comment #21 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
I filed this previously, maybe it's a duplicate:

Bug 61578 - Code size increase for ARM thumb compared to 4.8.x when compiling
with -Os

BR Fredrik


[Bug c/61578] New: Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2014-06-21 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

Bug ID: 61578
   Summary: Code size increase for ARM thumb compared to 4.8.x
when compiling with -Os
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fredrik.hederstie...@securitas-direct.com

We have problems when trying to switch from GCC 4.8.3 to GCC 4.9.0 (arm-eabi
thumb1 for arm966e-s core) due to significant code size increase (1-2%).

I attach our toolchain build scripts and two small example programs, though I'm
not fully convinced that these examples fully capture the issue.
The examples show +4.81% and +2% growth, though they are
slightly constructed.

At first I suspected it was related to Bug 59535, since the code size decreases
significantly when compiling without LRA (-mno-lra), but that does not
give the full picture, since the code size is still ~1% bigger than with GCC 4.8.3.

We also tried with and without LTO, but it does not solve our problem either.
See attached example.
Build with 'make' and compare with 'make compare'.

Thanks and Best Regards,
/Fredrik


[Bug c/61578] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2014-06-21 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #1 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
Created attachment 32980
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=32980&action=edit
Toolchain build script.


[Bug c/61578] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2014-06-21 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #2 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
Created attachment 32981
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=32981&action=edit
Toolchain build script 4.9.0


[Bug c/61578] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2014-06-21 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #3 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
Created attachment 32982
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=32982&action=edit
Makefile for examples.


[Bug c/61578] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2014-06-21 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #4 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
Created attachment 32983
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=32983&action=edit
Example source 1.


[Bug c/61578] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2014-06-21 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #5 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
Created attachment 32984
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=32984&action=edit
Example source 2.


[Bug target/45207] The -Os flag generates wrong code for ARM966e-s

2014-06-21 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45207

Fredrik Hederstierna fredrik.hederstie...@securitas-direct.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #9 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
It seems like functions are not always 4-aligned, even if -falign-functions=4
is passed, but I see no problem with this currently. I was running thumb
not-interworking code, and wrote custom ISR handler using ARM-code trampoline
function, but I found a work-around.


[Bug rtl-optimization/59535] [4.9 regression] -Os code size regressions for Thumb1/Thumb2 with LRA

2014-06-12 Thread fredrik.hederstie...@securitas-direct.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

Fredrik Hederstierna fredrik.hederstie...@securitas-direct.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fredrik.hederstierna@securi
                   |                            |tas-direct.com

--- Comment #18 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com ---
I compared GCC 4.8.3 and GCC 4.9.0 for arm-none-eabi, and I still see a code
size increase for thumb1 (and thumb2) for both my arm966e and my cortex-m4
targets.

GCC 4.8.3
  RAM used 93812 
  Flash used   515968

GCC 4.9.0
  RAM used 93812 (same)
  Flash used   522608 (+1.3%)

Then I tried to disable LRA and results got better:

GCC 4.9.0 : added flag -mno-lra
  RAM used 93812 (same)
  Flash used   519624 (+0.7%)
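
The percentages above follow directly from the flash figures. A small
helper, written here only to make the arithmetic explicit (the function name
is illustrative), returns the growth in hundredths of a percent so that no
floating point is needed:

```c
/* Growth of new_size over old_size in hundredths of a percent,
   truncated toward zero; e.g. 128 means +1.28%.  */
long growth_centipercent(long old_size, long new_size)
{
    return (long)(((long long)new_size - old_size) * 10000 / old_size);
}
```

For the numbers quoted, 515968 to 522608 gives 128 (about +1.3%) and
515968 to 519624 gives 70 (about +0.7%).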

Flags used are otherwise identical for both tests:

  -Os -g3 -ggdb3 -gdwarf-4
  -fvar-tracking-assignments -fverbose-asm -fno-common -ffunction-sections  
-fdata-sections -fno-exceptions -fno-asynchronous-unwind-tables
-fno-unwind-tables
  -mthumb -mcpu=arm966e-s -msoft-float -mno-unaligned-access

Generally GCC 4.9.0 seems to produce larger code. I tried to experiment with
LTO (-flto -flto-fat-objects), but then the code size increased even more for
both GCC 4.8.3 and GCC 4.9.0, although I was expecting a decrease.

Sorry I cannot share exact sources used for compilation here, I can share
toolchain build script though on request, or try to setup a small test case.

I first just wanted to confirm whether this bug really is fixed and resolved,
and that what I see is not a new bug or another known issue.

BR /Fredrik


[Bug c/52923] New: Warn if making external references to local stack memory

2012-04-10 Thread fredrik.hederstie...@securitas-direct.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52923

 Bug #: 52923
   Summary: Warn if making external references to local stack
memory
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: fredrik.hederstie...@securitas-direct.com


Created attachment 27123
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27123
Example code with functions returning with stack memory refs bugs.

GCC does warn when returning a pointer to a local variable (stack memory).
But there are a lot more cases where GCC could possibly warn,
e.g. when references are made to local variables or stack memory.

See this attached example code.
GCC warns for first case, but not the others.
I think all cases can be considered program bugs,
and could trigger a compiler warning I think.
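
To illustrate the kind of case meant here (this is not the attached file;
all names below are made up), the address of a local can escape without
being returned directly, for instance through a global or an out-parameter.
A minimal sketch, with a safe variant for contrast:

```c
#include <stddef.h>

/* Hypothetical global used only to show the escape. */
const int *g_escaped = NULL;

/* Address of a local stored in a global: dangling after return. */
void escape_via_global(void)
{
    int local = 42;
    g_escaped = &local;
}

/* Address of a local handed back through an out-parameter:
   also dangling after return. */
void escape_via_out_param(const int **out)
{
    int local = 42;
    *out = &local;
}

/* Safe alternative: the caller owns the storage. */
void fill_caller_storage(int *dst)
{
    *dst = 42;
}
```

Dereferencing g_escaped or *out after the call would be a program bug of the
same nature as returning the address of a local.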

I've found out that the present warning is done in c-typeck.c;
is this the right place to put additional warnings of this kind too?

Thanks & Best Regards
Fredrik Hederstierna

The example code file was compiled with -O2 -W -Wall -Wextra
for enabling as many warnings as possible.


[Bug c/52923] Warn if making external references to local stack memory

2012-04-10 Thread fredrik.hederstie...@securitas-direct.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52923

--- Comment #4 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com 2012-04-10 12:52:36 UTC ---
Maybe there are advantages to having a pointer-deref analysis pass rather than a
points-to analysis pass. Then GCC could warn only if the pointer is actually
dereferenced, to avoid false positives. But in the case of shared
library code etc., I guess we never know what users/callers will do with the
pointer...

Could there possibly be a connection to the work I think Jeff Law and
others may be doing on a null-deref checking pass? I guess they already do
some flow analysis, checking for null-deref where this case would check for
'dangling-mem-deref' (e.g. stack-local memory or free()d memory).

(I think this is done in PR16351.)

I have also seen __attribute__((nonnull)) with -Wnonnull; could it perhaps be
possible to have some __attribute__((nonlocal)) or similar when declaring a
pointer?
/Fredrik


[Bug c/50997] ARM: No warnings for unreachable code for ARM cross compiler

2011-11-07 Thread fredrik.hederstie...@securitas-direct.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50997

--- Comment #3 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com 2011-11-07 21:11:13 UTC ---
Ok, answer to myself. I found the patch:

http://gcc.gnu.org/ml/gcc-patches/2009-11/msg00169.html

I think the patch is very unfortunate; had I known, I would have objected.
It's much better to keep warnings for -O0 level only.
I'm thinking of suggesting a patch to restore this warning.
/Fredrik


[Bug c/50997] New: ARM: No warnings for unreachable code for ARM cross compiler

2011-11-06 Thread fredrik.hederstie...@securitas-direct.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50997

 Bug #: 50997
   Summary: ARM: No warnings for unreachable code for ARM cross
compiler
Classification: Unclassified
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: fredrik.hederstie...@securitas-direct.com


Created attachment 25729
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25729
Example file with warnings for unreachable code

When I compile for x86 the -Wunreachable-code works as intended, but when using
ARM cross compiler I do not get any warnings.

Attached is example C-file, compiled with (tested -O0,-O1,-O2,-O3):

  gcc -c -O2 unreachable.c -W -Wall -Wextra -Wunreachable-code

For x86 GCC I get output:

unreachable.c: In function ‘unreachable’:
unreachable.c:7: warning: ignoring return value of ‘scanf’, declared with
attribute warn_unused_result
unreachable.c:41: warning: will never be executed
unreachable.c:46: warning: will never be executed
unreachable.c: In function ‘main’:
unreachable.c:22: warning: will never be executed
unreachable.c:33: warning: will never be executed
unreachable.c:37: warning: will never be executed

Which is correct

For ARM-cross compiler (4.5.1 and 4.6.0 tested) I do not get any warnings at
all.

ARM-cross compiler was compiled with

configure --enable-languages=c,c++ --target=$TARGET --prefix=$DEST
--with-gnu-as --with-gnu-ld --disable-nls --with-newlib --disable-__cxa_atexit
--with-ecos

make LDFLAGS=-s all all-gcc all-target-libstdc++-v3 install install-gcc
install-target-libstdc++-v3

Fredrik Hederstierna
Securitas Direct AB
Malmoe Sweden


[Bug c/50997] ARM: No warnings for unreachable code for ARM cross compiler

2011-11-06 Thread fredrik.hederstie...@securitas-direct.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50997

--- Comment #2 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com 2011-11-07 07:33:19 UTC ---
Ok, I didn't know that the checks for unreachable code were removed.

Though I would like to know about the background/discussion behind this
decision. Does anyone have a link or reference to the mailing-list conversation?

Currently I'm looking into adding a static code analysis pass to GCC, and was
thinking of running the VRP pass also at -O0 by tweaking VRP and calling workers
directly from the static analysis pass. I was hoping to have GCC support for
e.g. unreachable code warnings. Now it seems I have to implement this myself?

Thanks!/Fredrik


[Bug c/46765] New: Superfluous 'const' declaration does not generate error or warning

2010-12-02 Thread fredrik.hederstie...@securitas-direct.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46765

   Summary: Superfluous 'const' declaration does not generate
error or warning
   Product: gcc
   Version: 4.5.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: fredrik.hederstie...@securitas-direct.com


Assume we want to declare a pointer to constant data.
This can be done by e.g.

  int const *   ptrToConst1;

But C/C++ also accepts:

  //identical to: int const *   ptrToConst1;
  const int *   ptrToConst2;

But GCC also accepts a double-const declaration:

  //identical to ??: const int *   ptrToConst2;
  //superfluous const? No warning nor error.
  const int const * ptrToConst3;

The superfluous 'const' declaration does not generate an error or warning.
It's obvious that the programmer most likely wanted to declare a
constant pointer to constant data, but he only gets a
pointer to constant data. He should get a warning or parse error for this.
There is no point in declaring 'constant data' twice.
The second const-declaration does not have any effect.
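
To make the difference concrete, here is a small sketch (variable and
function names are illustrative). As far as I understand C99 and later, a
repeated qualifier is permitted and behaves as if it appeared once, so the
double-const form is still just a pointer to constant data and the pointer
itself stays writable:

```c
/* Shows which of the two pointers may be re-aimed after initialization. */
int const_placement_demo(void)
{
    static const int a = 1, b = 2;

    const int const *dup = &a;      /* duplicate const: same as const int * */
    dup = &b;                       /* accepted: the pointer is not const */

    const int *const fixed = &a;    /* constant pointer to constant data */
    /* fixed = &b; */               /* would be rejected: pointer is const */

    return *dup + *fixed;           /* b + a */
}
```

The form the programmer most likely intended is the second one, with the
const placed after the '*'.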

I'm compiling with GCC-4.5.1 and having all possible warning flags:
-W -Wall -Wextra.

I also attach small example file with test of various const-declarations.


[Bug c/46765] Superfluous 'const' declaration does not generate error or warning

2010-12-02 Thread fredrik.hederstie...@securitas-direct.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46765

--- Comment #1 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com 2010-12-02 13:34:26 UTC ---
Created attachment 22602
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22602
example file for const.


[Bug c/46766] New: Type 'void' is treated differently if used as return value or as parameter

2010-12-02 Thread fredrik.hederstie...@securitas-direct.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46766

   Summary: Type 'void' is treated differently if used as return
value or as parameter
   Product: gcc
   Version: 4.5.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: fredrik.hederstie...@securitas-direct.com


It is valid to return a void-function, or cast a variable to void, from a
void-function.

This makes some sense, in particular in C++ since we might have a template, and
we would like to put 'void' as type in this C++ template.

But then maybe it should also be allowed to put 'void' as inparameter to a
void-function, but then compiler warns about too many arguments.

void.c: In function ‘main’:
void.c:16: error: too many arguments to function ‘f1’
void.c:17: error: too many arguments to function ‘f2’

Somehow it would be more 'aligned' to have function-return-values and
function-in-parameters work the same way, so that template-alike-constructions,
or similar pure C macro/preprocessor constructions, could work the same
perhaps?

void f1(void)
{
  return (void)0; //OK
}

void f2(void)
{
  return f1(); //OK
}

int main(void)
{
  f1();          //OK
  f2();          //OK
  f1((void)0);   //ERROR
  f2(f1());      //ERROR
  return 0;
}


[Bug c/46766] Type 'void' is treated differently if used as return value or as parameter

2010-12-02 Thread fredrik.hederstie...@securitas-direct.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46766

--- Comment #2 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com 2010-12-02 14:42:35 UTC ---
Ok, but f1() also declares that it does not return anything, and still it can
return (void)0;

I'm not saying either is wrong, I just thought it should be consistent.

If it's ok to _return_ (void) from a function, why is it not ok to have (void)
as an _inparameter_ to a function? Where is the logical difference?

/Fredrik


[Bug c/46766] Type 'void' is treated differently if used as return value or as parameter

2010-12-02 Thread fredrik.hederstie...@securitas-direct.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46766

--- Comment #4 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com 2010-12-02 15:55:05 UTC ---
Yes, I agree it's EURGH.
I guess it's not preferred to write C++ template-like code in C.
I think it's worth avoiding stuff like:

#ifdef BLAH_BLAH_BLAH
  #define RETURN_TYPE_C_TEMPLATE  void
  #define PARAM_TYPE_C_TEMPLATE   void
  #define PARAM_TYPE_C_TEMPLATE_VARIABLE
#else
  #define RETURN_TYPE_C_TEMPLATE  int
  #define PARAM_TYPE_C_TEMPLATE   int
  #define PARAM_TYPE_C_TEMPLATE_VARIABLE  var
#endif

RETURN_TYPE_C_TEMPLATE func( PARAM_TYPE_C_TEMPLATE 
PARAM_TYPE_C_TEMPLATE_VARIABLE )
{
  return (RETURN_TYPE_C_TEMPLATE) template_func1(
PARAM_TYPE_C_TEMPLATE_VARIABLE );
}

int main(void)
{
  func((PARAM_TYPE_C_TEMPLATE) template_func2());  
  return ERR_EURGH;
}


[Bug c/46483] New: Built-in memcpy() does not handle unaligned int for ARM

2010-11-15 Thread fredrik.hederstie...@securitas-direct.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46483

   Summary: Built-in memcpy() does not handle unaligned int for
ARM
   Product: gcc
   Version: 4.5.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: fredrik.hederstie...@securitas-direct.com


I had some problems with memcpy() when copying unaligned integers for ARM.

For the Intel target it worked fine, and I concluded that if I overloaded the
gcclib weak memcpy reference I had the same problem, but if I just overloaded
memcpy with a #define to my own function, it worked fine.

At first glance I thought it was just a possible strict-aliasing violation, but
after a closer look I'm not so sure, so I submit this to Bugzilla anyway
for validation.

I submit some lines of source-code that gives an example of the problem.

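//----------------------------------------------------------------------------
#include <stdio.h>
#include <string.h>
//----------------------------------------------------------------------------
struct unaligned_int {
  char dummy1;
  char dummy2;
  char dummy3;
  unsigned int number;
} __attribute__((packed));
//----------------------------------------------------------------------------
void *my_memcpy(void *dst, const void *src, size_t n)
{
  char *s = (char*)src;
  char *d = (char*)dst;
  while (n--) {
    *d++ = *s++;
  }
  return dst;
}
//----------------------------------------------------------------------------
static void copy_x_into_struct(char *buf, unsigned int x)
{
  unsigned int i;
  struct unaligned_int* testp = (struct unaligned_int*)buf;

  memset(buf, 0xFF, sizeof(unsigned int));
  memcpy((void*)&(testp->number), &x, sizeof(unsigned int));

  printf("BUILT-IN MEMCPY TO %08x: ", (unsigned int)&(testp->number));
  for (i = 0; i < sizeof(struct unaligned_int); i++) {
    printf(" %02x", buf[i]);
  }
  printf("\n");

  memset(buf, 0xFF, sizeof(unsigned int));
  my_memcpy((void*)&(testp->number), &x, sizeof(int));

  printf("USER-DEF MEMCPY TO %08x: ", (unsigned int)&(testp->number));
  for (i = 0; i < sizeof(struct unaligned_int); i++) {
    printf(" %02x", buf[i]);
  }
  printf("\n");
}
//----------------------------------------------------------------------------
int main(void)
{
  char buf[100];
  unsigned int x = 0x12345678;
  copy_x_into_struct(buf, x);
  return 0;
}

----------------------------------------------------------------------------

PROGRAM OUTPUT:

BUILT-IN MEMCPY TO 04016bd7:  78 56 34 12 00 00 00
USER-DEF MEMCPY TO 04016bd7:  ff ff ff 78 56 34 12

----------------------------------------------------------------------------

GCC-COMMAND-LINE:

arm-elf-gcc -g3 -ggdb3 -gdwarf-2 -mthumb -c -Wall -W -Wextra
-Wno-unused-parameter -mcpu=arm966e-s -Os -fno-omit-frame-pointer -fno-web
-mhard-float -mfpu=fpa -ffunction-sections -fdata-sections
-o test.o test.c

----------------------------------------------------------------------------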

TOOLCHAIN:  Attaching toolchain build-script.
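
One way to sidestep the question of what alignment the compiler may assume
for the address of the packed member is to do the copy through a plain byte
pointer computed with offsetof, so that no unsigned int pointer to a
misaligned address is ever formed. A minimal sketch, assuming the same
packed struct as in the test case; the helper names store_number and
load_number are mine, and this is only a possible workaround, not a
statement about where the problem lies:

```c
#include <stddef.h>
#include <string.h>

struct unaligned_int {
  char dummy1;
  char dummy2;
  char dummy3;
  unsigned int number;
} __attribute__((packed));

/* Copy x into the packed member using only byte-typed pointers. */
void store_number(char *buf, unsigned int x)
{
  memcpy(buf + offsetof(struct unaligned_int, number), &x, sizeof x);
}

/* Read the packed member back the same way. */
unsigned int load_number(const char *buf)
{
  unsigned int x;
  memcpy(&x, buf + offsetof(struct unaligned_int, number), sizeof x);
  return x;
}
```

Both helpers take the buffer as char pointers, so the only alignment the
compiler can rely on for the destination is that of char.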


[Bug c/46483] Built-in memcpy() does not handle unaligned int for ARM

2010-11-15 Thread fredrik.hederstie...@securitas-direct.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46483

--- Comment #1 from Fredrik Hederstierna 
fredrik.hederstie...@securitas-direct.com 2010-11-15 14:42:48 UTC ---
Created attachment 22399
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22399
Script to build arm-elf toolchain