[Bug c++/66239] Unoptimized sqrt(float or double) returns wrong values for ARM Cortex-A8 -mfloat-abi=[soft,softfp]

2015-06-08 Thread maciej.andrzejewski at data dot pl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66239

Maciej Andrzejewski  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #5 from Maciej Andrzejewski  ---
Libm and other libraries had been mixed. A properly compiled toolchain with
its libraries solved the problem.
Thank you all for your assistance!
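
The comment does not say how the libraries were mixed. If the mix involved float ABIs, one quick check is which float ABI a given translation unit is built for. The following is a minimal sketch, assuming GCC's predefined macros `__arm__`, `__ARM_PCS_VFP` and `__SOFTFP__`; the function name `float_abi_name` is made up for illustration and is not part of the report.

```c
/* Sketch only: reports the float ABI this file was compiled for,
 * based on GCC's predefined macros (an assumption about the compiler). */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    /* Not a 32-bit ARM build, so -mfloat-abi does not apply. */
    return "not a 32-bit ARM target";
#elif defined(__ARM_PCS_VFP)
    /* -mfloat-abi=hard: floating-point arguments travel in VFP registers. */
    return "hard";
#elif defined(__SOFTFP__)
    /* -mfloat-abi=soft: no FPU instructions, arguments in core registers. */
    return "soft";
#else
    /* -mfloat-abi=softfp: FPU instructions, arguments in core registers. */
    return "softfp";
#endif
}
```

Printing the returned string from a test program built with the same flags as the application shows what the compiler assumed. For the libraries themselves, `readelf -A` lists a `Tag_ABI_VFP_args` attribute on objects built for the hard-float ABI.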


[Bug c++/66239] Unoptimized sqrt(float or double) returns wrong values for ARM Cortex-A8 -mfloat-abi=[soft,softfp]

2015-05-21 Thread maciej.andrzejewski at data dot pl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66239

--- Comment #1 from Maciej Andrzejewski  ---
It is getting even more interesting.

I have disassembled 4 binaries compiled with these options:
1) -mfloat-abi=softfp
2) -mfloat-abi=softfp -O
3) -mfloat-abi=hard
4) -mfloat-abi=hard -O

From what I understand, turning the optimization ON turns the FPU OFF!
In the two cases where optimization is enabled, I don't see any FPU s**
registers used in the assembly:

-- DISASSEMBLY OPTION 1 --
00010570 :
   10570:   e92d4800    push    {fp, lr}
   10574:   e28db004    add     fp, sp, #4
   10578:   e24dd040    sub     sp, sp, #64 ; 0x40
   1057c:   e3032333    movw    r2, #13107  ; 0x
   10580:   e3432333    movt    r2, #13107  ; 0x
   10584:   e303        movw    r3, #13107  ; 0x
   10588:   e3443022    movt    r3, #16418  ; 0x4022
   1058c:   e14b20fc    strd    r2, [fp, #-12]
   10590:   ed5b0b03    vldr    d16, [fp, #-12]
   10594:   eef77be0    vcvt.f32.f64    s15, d16
   10598:   ee170a90    vmov    r0, s15
   1059c:   eba1        bl      10428
   105a0:   ee070a90    vmov    s15, r0
   105a4:   eef70ae7    vcvt.f64.f32    d16, s15
   105a8:   ed4b0b05    vstr    d16, [fp, #-20] ; 0xffec
   105ac:   e30006cc    movw    r0, #1740   ; 0x6cc
   105b0:   e341        movt    r0, #1
   105b4:   e14b21d4    ldrd    r2, [fp, #-20]  ; 0xffec
   105b8:   eba0        bl      10440
   105bc:   e309399a    movw    r3, #39322  ; 0x999a
   105c0:   e3443111    movt    r3, #16657  ; 0x4111
   105c4:   e50b3018    str     r3, [fp, #-24]  ; 0xffe8
   105c8:   e51b0018    ldr     r0, [fp, #-24]  ; 0xffe8
   105cc:   eb95        bl      10428
   105d0:   ee070a90    vmov    s15, r0
   105d4:   eef70ae7    vcvt.f64.f32    d16, s15
   105d8:   ed4b0b09    vstr    d16, [fp, #-36] ; 0xffdc
   105dc:   e30006cc    movw    r0, #1740   ; 0x6cc
   105e0:   e341        movt    r0, #1
   105e4:   e14b22d4    ldrd    r2, [fp, #-36]  ; 0xffdc
   105e8:   eb94        bl      10440
   105ec:   e3032333    movw    r2, #13107  ; 0x
   105f0:   e3432333    movt    r2, #13107  ; 0x
   105f4:   e303        movw    r3, #13107  ; 0x
   105f8:   e3443022    movt    r3, #16418  ; 0x4022
   105fc:   e14b22fc    strd    r2, [fp, #-44]  ; 0xffd4
   10600:   e14b02dc    ldrd    r0, [fp, #-44]  ; 0xffd4
   10604:   eb8a        bl      10434
   10608:   e14b03f4    strd    r0, [fp, #-52]  ; 0xffcc
   1060c:   e30006cc    movw    r0, #1740   ; 0x6cc
   10610:   e341        movt    r0, #1
   10614:   e14b23d4    ldrd    r2, [fp, #-52]  ; 0xffcc
   10618:   eb88        bl      10440
   1061c:   e309399a    movw    r3, #39322  ; 0x999a
   10620:   e3443111    movt    r3, #16657  ; 0x4111
   10624:   e50b3038    str     r3, [fp, #-56]  ; 0xffc8
   10628:   ed5b7a0e    vldr    s15, [fp, #-56] ; 0xffc8
   1062c:   eef70ae7    vcvt.f64.f32    d16, s15
   10630:   ec510b30    vmov    r0, r1, d16
   10634:   eb7e        bl      10434
   10638:   e14b04f4    strd    r0, [fp, #-68]  ; 0xffbc
   1063c:   e30006cc    movw    r0, #1740   ; 0x6cc
   10640:   e341        movt    r0, #1
   10644:   e14b24d4    ldrd    r2, [fp, #-68]  ; 0xffbc
   10648:   eb7c        bl      10440
   1064c:   e3a03000    mov     r3, #0
   10650:   e1a3        mov     r0, r3
   10654:   e24bd004    sub     sp, fp, #4
   10658:   e8bd8800    pop     {fp, pc}
-- DISASSEMBLY OPTION 1 --
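
The movw/movt immediates in these listings can be decoded by hand to see which floating-point values the registers hold. The following is a minimal sketch, assuming a little-endian IEEE 754 target where the higher-numbered register of a pair carries the high word; the helper names are illustrative and not taken from the report.

```c
#include <stdint.h>
#include <string.h>

/* Reassemble a 32-bit value from the 16-bit immediates of a movw/movt pair. */
uint32_t from_movw_movt(uint16_t movw_imm, uint16_t movt_imm)
{
    return ((uint32_t)movt_imm << 16) | movw_imm;
}

/* Interpret a (high word, low word) register pair as an IEEE 754 double. */
double double_from_words(uint32_t hi, uint32_t lo)
{
    uint64_t bits = ((uint64_t)hi << 32) | lo;
    double value;
    memcpy(&value, &bits, sizeof value);  /* bit copy, no numeric conversion */
    return value;
}

/* Interpret a single 32-bit register as an IEEE 754 float. */
float float_from_word(uint32_t word)
{
    float value;
    memcpy(&value, &word, sizeof value);
    return value;
}
```

With the unoptimized listing, `double_from_words(from_movw_movt(13107, 16418), from_movw_movt(13107, 13107))` gives 9.1 and `float_from_word(from_movw_movt(39322, 16657))` gives 9.1f, the inputs from the test program. In the optimized listing (option 2), r6 = 0 and r7 = `from_movw_movt(8714, 16392)` decode to roughly 3.016621, which is passed in r2:r3 to the call at 103c4. That looks consistent with the result having been computed at compile time, which would explain the missing s** registers without the FPU being switched off.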



-- DISASSEMBLY OPTION 2 --
000104f4 :
   104f4:   e92d40d0    push    {r4, r6, r7, lr}
   104f8:   e30045d4    movw    r4, #1492   ; 0x5d4
   104fc:   e3404001    movt    r4, #1
   10500:   e3a06000    mov     r6, #0
   10504:   e302720a    movw    r7, #8714   ; 0x220a
   10508:   e3447008    movt    r7, #16392  ; 0x4008
   1050c:   e1a4        mov     r0, r4
   10510:   e1a02006    mov     r2, r6
   10514:   e1a03007    mov     r3, r7
   10518:   eba9        bl      103c4
   1051c:   e1a4        mov     r0, r4
   10520:   e1a02006    mov     r2, r6
   10524:   e1a03007    mov     r3, r7
   10528:   eba5        bl      103c4
   1052c:   e1a4        mov     r0, r4
   10530:   e30f2d38    movw    r2, #64824  ; 0xfd38
   10534:   e34f2ea1    movt    r2, #65185  ; 0xfea1
   10538:   e3023209    movw    r3, #8713   ; 0x2209
   

[Bug c++/66239] New: Unoptimized sqrt(float or double) returns wrong values for ARM Cortex-A8 -mfloat-abi=[soft,softfp]

2015-05-21 Thread maciej.andrzejewski at data dot pl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66239

Bug ID: 66239
   Summary: Unoptimized sqrt(float or double) returns wrong values
for ARM Cortex-A8 -mfloat-abi=[soft,softfp]
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: maciej.andrzejewski at data dot pl
  Target Milestone: ---

Tested both on GCC 4.9.1 (compiled toolchain using crosstool-ng) and 4.9.2
(compiled toolchain using buildroot).

Simple code:

-- CODE --
#include <stdio.h>
#include <math.h>
/* A third #include was present; its header name is not recoverable. */


int main( void )
{
    double sq3 = 9.1;
    double ret3 = sqrtf(sq3);
    printf("%f\n", ret3);

    float sq4 = 9.1;
    double ret4 = sqrtf(sq4);
    printf("%f\n", ret4);

    double sq1 = 9.1;
    double ret1 = sqrt(sq1);
    printf("%f\n", ret1);

    float sq2 = 9.1;
    double ret2 = sqrt(sq2);
    printf("%f\n", ret2);

    return 0;
}
-- CODE --

compiled with command:
g++ -mcpu=cortex-a8 -march=armv7-a -mtune=cortex-a8 -mfpu=neon
-mthumb-interwork -mfpu=neon -Wall -Wextra -mfloat-abi=softfp

or

g++ -mcpu=cortex-a8 -march=armv7-a -mtune=cortex-a8 -mfpu=neon 
-mthumb-interwork -mfpu=neon -Wall -Wextra -mfloat-abi=soft

prints output when run:
-- OUTPUT --
0.00
0.00
89884613882771507935772421602449274280826426490922860415370742828850803088708436568909338933268615382725731836148779976703521876921396883553861971381426763394584730974161643341227168116324626810241973964225774272233175406843466371141886318608237834273423597057507238373804952327520531541920130815989439791104.00
89884613882771507935772421602449274280826426490922860415370742828850803088708436568909338933268615382725731836148779976703521876921396883553861971381426763394584730974161643341227168116324626810241973964225774272233175406843466371141886318608237834273423597057507238373804952327520531541920130815989439791104.00
-- OUTPUT --

When compiled with any optimization flag (-O, -O1, -O2, -O3, -Og), it prints
the correct output:
-- OUTPUT 2 --
3.016621
3.016621
3.016621
3.016621
-- OUTPUT 2 --




*** Additional info ***

g++ --version:
arm-none-linux-gnueabi-g++ (Buildroot 2015.02) 4.9.2

readelf -a a.out:
-- READELF --
macieja@ubuntu:~/Projects/sqrtf$ readelf -a a.out
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class: ELF32
  Data:  2's complement, little endian
  Version:   1 (current)
  OS/ABI:UNIX - System V
  ABI Version:   0
  Type:  EXEC (Executable file)
  Machine:   ARM
  Version:   0x1
  Entry point address:   0x103d0
  Start of program headers:  52 (bytes into file)
  Start of section headers:  5012 (bytes into file)
  Flags: 0x5000202, has entry point, Version5 EABI, soft-float ABI
  Size of this header:   52 (bytes)
  Size of program headers:   32 (bytes)
  Number of program headers: 8
  Size of section headers:   40 (bytes)
  Number of section headers: 29
  Section header string table index: 26

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00 00 00  0   0  0
  [ 1] .interp           PROGBITS        00010134 000134 13     00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            00010148 000148 20     00   A  0   0  4
  [ 3] .hash             HASH            00010168 000168 38     04   A  4   0  4
  [ 4] .dynsym           DYNSYM          000101a0 0001a0 90     10   A  5   1  4
  [ 5] .dynstr           STRTAB          00010230 000230 d3     00   A  0   0  1
  [ 6] .gnu.version      VERSYM          00010304 000304 12     02   A  4   0  2
  [ 7] .gnu.version_r    VERNEED         00010318 000318 40     00   A  5   2  4
  [ 8] .rel.dyn          REL             00010358 000358 08     08   A  4   0  4
  [ 9] .rel.plt          REL             00010360 000360 20     08  AI  4  11  4
  [10] .init             PROGBITS        00010380 000380 0c     00  AX  0   0  4
  [11] .plt              PROGBITS        0001038c 00038c 44     04  AX  0   0  4
  [12] .text             PROGBITS        000103d0 0003d0 0001f8 00  AX  0   0  4
  [13] .fini             PROGBITS        000105c8 0005c8 08     00  AX  0   0  4
  [14] .rodata           PROGBITS        000105d0 0005d0 08     00   A  0   0  4
  [15] .ARM.exidx        ARM_EXIDX       000105d8 0005d8 18     00  AL 12   0  4
  [16] .eh_frame         PROGBITS        000105f0 0005f0 04     00   A  0   0  4
  [17] .init_array       INIT_ARRAY      000205f4 0005f4 04     00  WA  0   0  4
  [18] .fini_array   FINI_ARRAY  000205