------- Comment #2 from pinskia at gmail dot com  2010-05-14 13:10 -------
Subject: Re:   New: Unneeded +0.0 for c = 0.0 ; c = c+ a*b

On May 14, 2010, at 2:18 AM, "tkoenig at gcc dot gnu dot org"
<gcc-bugzi...@gcc.gnu.org> wrote:

> This code leads to an addition of 0.0, which is a no-op.  Any
> signalling should have been done previously.

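It is not signalling that matters here but signed zeros. Under the
default rounding mode, 0.0 + (-0.0) is +0.0, so if a*b evaluates to
-0.0 the addition changes the result. Without the "0.0 +", c could
end up as a negative zero.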
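
A minimal C sketch of that difference, assuming IEEE 754 arithmetic in
the default round-to-nearest mode and no -ffast-math. The function
names foo_with_add and foo_without_add are made up for illustration;
they are not part of GCC or of the test case:

```c
#include <math.h>   /* signbit */

/* Hypothetical C analogue of the Fortran subroutine:
   c = 0.0; c = c + a*b */
static float foo_with_add(float a, float b)
{
    volatile float c = 0.0f;  /* volatile discourages compile-time folding */
    c = c + a * b;            /* 0.0 + (-0.0) gives +0.0 when rounding to nearest */
    return c;
}

/* The same computation with the "0.0 +" dropped. */
static float foo_without_add(float a, float b)
{
    volatile float c = a * b; /* -1.0 * 0.0 gives -0.0, and the sign survives */
    return c;
}
```

With a = -1.0f and b = 0.0f, signbit() is zero for foo_with_add and
nonzero for foo_without_add, so the two are only interchangeable when
the sign of zero is allowed to be ignored, for example with
-fno-signed-zeros (which -ffast-math implies).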


>
> i...@linux-fd1f:/tmp> cat mult.f90
> subroutine foo(a,b,c)
>  real, intent(in) :: a,b
>  real, intent(out) :: c
>  c = 0.0
>  c = c + a*b
> end subroutine foo
> i...@linux-fd1f:/tmp> gfortran -O3 -fdump-tree-optimized -S mult.f90
> i...@linux-fd1f:/tmp> cat mult.f90.142t.optimized
>
> ;; Function foo (foo_)
>
> foo (real(kind=4) & restrict a, real(kind=4) & restrict b, real(kind=4) & restrict c)
> {
>  real(kind=4) D.1542;
>  real(kind=4) D.1541;
>  real(kind=4) D.1540;
>  real(kind=4) D.1539;
>
> <bb 2>:
>  D.1539_4 = *a_3(D);
>  D.1540_6 = *b_5(D);
>  D.1541_7 = D.1539_4 * D.1540_6;
>  D.1542_8 = D.1541_7 + 0.0;
>  *c_1(D) = D.1542_8;
>  return;
>
> }
>
> i...@linux-fd1f:/tmp> cat mult.s
>        .file   "mult.f90"
>        .text
>        .p2align 4,,15
> .globl foo_
>        .type   foo_, @function
> foo_:
> .LFB0:
>        movss   (%rdi), %xmm0
>        mulss   (%rsi), %xmm0
>        addss   .LC0(%rip), %xmm0
>        movss   %xmm0, (%rdx)
>        ret
> .LFE0:
>        .size   foo_, .-foo_
>        .section        .rodata.cst4,"aM",@progbits,4
>        .align 4
> .LC0:
>        .long   0
>        .section        .eh_frame,"a",@progbits
> .Lframe1:
>        .long   .LECIE1-.LSCIE1
> .LSCIE1:
>        .long   0
>        .byte   0x1
>        .string "zR"
>        .uleb128 0x1
>        .sleb128 -8
>        .byte   0x10
>        .uleb128 0x1
>        .byte   0x3
>        .byte   0xc
>        .uleb128 0x7
>        .uleb128 0x8
>        .byte   0x90
>        .uleb128 0x1
>        .align 8
> .LECIE1:
> .LSFDE1:
>        .long   .LEFDE1-.LASFDE1
> .LASFDE1:
>        .long   .LASFDE1-.Lframe1
>        .long   .LFB0
>        .long   .LFE0-.LFB0
>        .uleb128 0
>        .align 8
> .LEFDE1:
>        .ident  "GCC: (GNU) 4.6.0 20100513 (experimental)"
>        .section        .note.GNU-stack,"",@progbits
>
>
> -- 
>           Summary: Unneeded +0.0 for c = 0.0 ; c = c+ a*b
>           Product: gcc
>           Version: 4.6.0
>            Status: UNCONFIRMED
>          Keywords: missed-optimization
>          Severity: enhancement
>          Priority: P3
>         Component: middle-end
>        AssignedTo: unassigned at gcc dot gnu dot org
>        ReportedBy: tkoenig at gcc dot gnu dot org
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44134
>


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44134
