disabling loop unrolling in GCC

2009-12-21 Thread Shachar Shemesh

Hi all,

I'm trying, without success, to disable loop unrolling when compiling a 
program with -O3 with gcc (4.4, but I see the same problem with 4.3).


The program is the following one:

volatile int v;

void func()
{
    int i;

    for( i=0; i<8; ++i ) {
        v=0;
    }
}

I compile it with the following command line:

gcc -c -O3 test.c

An objdump -S test.o gives:

test.o: file format elf64-x86-64

Disassembly of section .text:

 <func>:
   0:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # a <func+0xa>
   7:   00 00 00
   a:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # 14 <func+0x14>
  11:   00 00 00
  14:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # 1e <func+0x1e>
  1b:   00 00 00
  1e:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # 28 <func+0x28>
  25:   00 00 00
  28:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # 32 <func+0x32>
  2f:   00 00 00
  32:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # 3c <func+0x3c>
  39:   00 00 00
  3c:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # 46 <func+0x46>
  43:   00 00 00
  46:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # 50 <func+0x50>
  4d:   00 00 00
  50:   c3                      retq

If I compile with -O2, the results are:

test.o: file format elf64-x86-64

Disassembly of section .text:

 <func>:
   0:   31 c0                   xor    %eax,%eax
   2:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
   8:   83 c0 01                add    $0x1,%eax
   b:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # 15 <func+0x15>
  12:   00 00 00
  15:   83 f8 08                cmp    $0x8,%eax
  18:   75 ee                   jne    8 <func+0x8>
  1a:   f3 c3                   repz retq

Where it gets worrying is when I try to cancel loop unrolling. I tried 
-fno-unroll-loops and -fno-peel-loops, to no effect. I even tried 
messing with the --param option (max-unrolled-insns, max-unroll-times, 
max-peel-times) to no noticeable effect.


Even more worryingly, the documentation seems totally wrong. It claims 
(http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Optimize-Options.html#index-O3-632) 
that -O3 is equal to -O2 plus -finline-functions, -funswitch-loops, 
-fpredictive-commoning, -fgcse-after-reload and -ftree-vectorize. Trying 
to compile with -O2 and the additional optimization options does not, 
however, unroll the loop, which suggests that -O3 differs from -O2 in 
another way as well.


Help?

Shachar

--
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com

___
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il


Re: disabling loop unrolling in GCC

2009-12-21 Thread Aviv Greenberg
Just out of curiosity: why do you care about the resulting assembly?
It's a strong indication that you are doing something wrong :)

I would try declaring i volatile or extern to trick the compiler into
dropping the optimization (if the flags don't work).
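
A minimal sketch of the volatile-counter idea, assuming the test program from the first message with only the declaration of i changed. It illustrates the suggestion and is not a recommendation for production code:

```c
volatile int v;

void func(void)
{
    /* Declaring the counter volatile forces every read and write of i to
     * go through memory, so the compiler can no longer treat the trip
     * count as a known constant and peel the loop away. */
    volatile int i;

    for (i = 0; i < 8; ++i) {
        v = 0;
    }
}
```

The price is an extra load and store of i on every iteration, so the loop itself gets slower and somewhat larger than the plain -O2 version.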

--Aviv



Re: disabling loop unrolling in GCC

2009-12-21 Thread Shachar Shemesh

Aviv Greenberg wrote:

> Just out of curiosity: why do you care about the resulting assembly?
> It's a strong indication that you are doing something wrong :)

First, we have found several bugs in GCC as a result of caring about 
the assembly. Let's agree that it's an indication that someone is doing 
something wrong.


The reason I'm trying to disable this optimization is that it makes 
the code too big to fit into the ROM onto which it needs to be 
flashed. The x86 version I gave here shows the problem, but x86 
is not the platform on which the problem was diagnosed.


Shachar



Re: disabling loop unrolling in GCC

2009-12-21 Thread Aviv Greenberg
This is what I get if I declare i volatile, using gcc 4.3.1 with -O3:

   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 10                sub    $0x10,%esp
   6:   c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%ebp)
   d:   8b 45 fc                mov    -0x4(%ebp),%eax
  10:   83 f8 07                cmp    $0x7,%eax
  13:   7f 1e                   jg     33 <func+0x33>
  15:   8d 76 00                lea    0x0(%esi),%esi
  18:   c7 05 00 00 00 00 00    movl   $0x0,0x0
  1f:   00 00 00
  22:   8b 45 fc                mov    -0x4(%ebp),%eax
  25:   83 c0 01                add    $0x1,%eax
  28:   89 45 fc                mov    %eax,-0x4(%ebp)
  2b:   8b 45 fc                mov    -0x4(%ebp),%eax
  2e:   83 f8 07                cmp    $0x7,%eax
  31:   7e e5                   jle    18 <func+0x18>
  33:   c9                      leave
  34:   c3                      ret

Looks like I was right!



Re: disabling loop unrolling in GCC

2009-12-21 Thread Dotan Shavit
On Monday 21 December 2009 14:00:39 Shachar Shemesh wrote:
> Where it gets worrying is when I try to cancel loop unrolling. I tried
> -fno-unroll-loops and -fno-peel-loops, to no effect. I even tried
> messing with the --param option (max-unrolled-insns, max-unroll-times,
> max-peel-times) to no noticeable effect

max-completely-peeled-insns is your friend.

The value of this param is also the difference between -O3 and -O2 that you
were missing.
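
A sketch of how that parameter might be applied to the test program. The value 0 is an assumption chosen for illustration (the thread does not name a number), so the result should be confirmed with objdump on the compiler version actually in use:

```c
/* Hypothetical build line; the value 0 is an assumption, not from the thread:
 *
 *   gcc -c -O3 --param max-completely-peeled-insns=0 test.c
 *
 * The GCC manual describes max-completely-peeled-insns as the maximum
 * number of insns of a completely peeled loop, so a very small limit
 * should make complete peeling of this loop unattractive to the compiler. */
volatile int v;

void func(void)
{
    int i;

    for (i = 0; i < 8; ++i) {
        v = 0;
    }
}
```

Unlike the volatile-counter trick, this leaves the source untouched and only changes the build flags.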



Re: disabling loop unrolling in GCC

2009-12-21 Thread Aviv Greenberg
Also, I tried grepping for "loop" and then negating all the loop-related flags:

linux-gec2:~/projects/lu # gcc -c -O3 -fno-align-loops
-fno-move-loop-invariants -fno-peel-loops -fno-prefetch-loop-arrays
-fno-rerun-cse-after-loop -fno-reschedule-modulo-scheduled-loops
-fno-tree-loop-im -fno-tree-loop-ivcanon -fno-tree-loop-linear
-fno-tree-loop-optimize -fno-tree-vect-loop-version
-fno-unroll-all-loops -fno-unroll-loops -fno-unsafe-loop-optimizations
-fno-unswitch-loops -fno-loop-optimize -fno-rerun-loop-opt  main.c

linux-gec2:~/projects/lu # objdump -S main.o

main.o: file format elf32-i386


Disassembly of section .text:

 <func>:
   0:   55                      push   %ebp
   1:   31 c0                   xor    %eax,%eax
   3:   89 e5                   mov    %esp,%ebp
   5:   8d 76 00                lea    0x0(%esi),%esi
   8:   83 c0 01                add    $0x1,%eax
   b:   83 f8 07                cmp    $0x7,%eax
   e:   c7 05 00 00 00 00 00    movl   $0x0,0x0
  15:   00 00 00
  18:   7e ee                   jle    8 <func+0x8>
  1a:   5d                      pop    %ebp
  1b:   c3                      ret
  1c:   8d 74 26 00             lea    0x0(%esi,%eiz,1),%esi

Now I just need to figure out which of these flags is the one that matters :)



Re: disabling loop unrolling in GCC

2009-12-21 Thread Oleg Goldshmidt
2009/12/21 Shachar Shemesh shac...@shemesh.biz:
> Hi all,
>
> I'm trying, without success, to disable loop unrolling when compiling a
> program with -O3 with gcc (4.4, but I see the same problem with 4.3).

I am actually very surprised that -O3 unrolls loops. It is not
supposed to. The idea of including -funroll-loops in -O3 was raised
quite a few times and was always rejected. Maybe something has changed in
recent years. The documentation certainly does not say that loop unrolling
is enabled with either -O2 or -O3.

I suspect something is the matter with -ftree-loop-optimize. The gcc
documentation says,

`-ftree-loop-optimize'
 Perform loop optimizations on trees.  This flag is enabled by
 default at `-O' and higher.

However, the behaviour depends on which optimization options you use.
E.g., -O2 won't unroll no matter what:

$ gcc -c -O2 -ftree-loop-optimize loop.c
$ objdump -S loop.o

loop.o: file format elf64-x86-64

Disassembly of section .text:

 <func>:
   0:   31 c0                   xor    %eax,%eax
   2:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
   8:   83 c0 01                add    $0x1,%eax
   b:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # 15 <func+0x15>
  12:   00 00 00
  15:   83 f8 08                cmp    $0x8,%eax
  18:   75 ee                   jne    8 <func+0x8>
  1a:   f3 c3                   repz retq


However, try compiling with -O3 -fno-tree-loop-optimize and you will succeed.

$ gcc -c -O3 -fno-tree-loop-optimize loop.c
$ objdump -S loop.o

loop.o: file format elf64-x86-64

Disassembly of section .text:

 <func>:
   0:   31 c0                   xor    %eax,%eax
   2:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
   8:   83 c0 01                add    $0x1,%eax
   b:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # 15 <func+0x15>
  12:   00 00 00
  15:   83 f8 07                cmp    $0x7,%eax
  18:   7e ee                   jle    8 <func+0x8>
  1a:   f3 c3                   repz retq
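
-fno-tree-loop-optimize switches the pass off for the whole translation unit. As an aside that does not apply to the 4.3/4.4 compilers in this thread: GCC 8 and later document a per-loop `#pragma GCC unroll n`, where the values 0 and 1 block unrolling of the loop that follows. A sketch under that assumption; whether it also suppresses the complete peeling seen at -O3 is best checked with objdump:

```c
volatile int v;

void func(void)
{
    int i;

    /* Documented for GCC 8 and later. Compilers that do not recognise the
     * pragma ignore it, possibly with a warning. It has to sit directly
     * in front of the loop it applies to. */
#pragma GCC unroll 1
    for (i = 0; i < 8; ++i) {
        v = 0;
    }
}
```

The advantage over a command-line flag is scope: other loops in the same file keep whatever the chosen -O level does to them.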

Or, if you are primarily interested in code size as you indicate, why not -Os?

$ gcc -c -Os loop.c
$ objdump -S loop.o

loop.o: file format elf64-x86-64

Disassembly of section .text:

 <func>:
   0:   31 c0                   xor    %eax,%eax
   2:   ff c0                   inc    %eax
   4:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # e <func+0xe>
   b:   00 00 00
   e:   83 f8 08                cmp    $0x8,%eax
  11:   75 ef                   jne    2 <func+0x2>
  13:   c3                      retq
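
If -Os for the whole program is too blunt, GCC 4.4 and later also accept a per-function optimize attribute. A sketch, assuming the attribute treats this one function the way -Os on the command line treats the whole file. Recent versions of the GCC manual caution that the attribute is intended for debugging rather than production use, so this is an experiment to verify with objdump, not a guaranteed fix:

```c
volatile int v;

/* Ask GCC to optimize only this function for size, whatever -O level is
 * given on the command line. Compilers without this attribute typically
 * ignore it with a warning. */
__attribute__((optimize("Os")))
void func(void)
{
    int i;

    for (i = 0; i < 8; ++i) {
        v = 0;
    }
}
```

This keeps -O3 for the speed-critical parts of the program while the size-critical functions are compiled as if with -Os.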

Hope it helps,

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: disabling loop unrolling in GCC

2009-12-21 Thread Shachar Shemesh

Dotan Shavit wrote:

> On Monday 21 December 2009 14:00:39 Shachar Shemesh wrote:
>> Where it gets worrying is when I try to cancel loop unrolling. I tried
>> -fno-unroll-loops and -fno-peel-loops, to no effect. I even tried
>> messing with the --param option (max-unrolled-insns, max-unroll-times,
>> max-peel-times) to no noticeable effect
>
> max-completely-peeled-insns is your friend
>
> This param's value is also the difference between -O3 and -O2 you were missing

Out of curiosity, how do you know that? Did you grep the gcc sources?

Shachar
