On Sat, Feb 10, 2018 at 04:02:36PM -0500, Dennis Clarke wrote:
> On 09/02/18 05:34 AM, John Paul Adrian Glaubitz wrote:
> > On 02/09/2018 11:30 AM, Bas Vermeulen wrote:
> > > mator on #debian-ports compiled gcc-7 for me with the attached patch.
> > > With the resulting gcc, I compiled glibc and got a library I can use
> > > sqrtf without running into an illegal instruction exception.
> > >
> > > Would it be possible to get this applied by default? The resulting
> > > binaries work on e6500, and ought to work on all supported CPUs
> > > for the ppc64 port.
> >
> > This is something that needs to be discussed. A single user alone shouldn't
> > warrant such major change in a port. You always have to keep in mind that
> > changing the default compiler options also has potential impact on the
> > performance on more modern ppc64 systems like Apple Macintosh.
> >
> Not sure how modern an Apple Mac is but here is a photo I took only a
> few minutes ago:
>
> https://i.imgur.com/6UbviKb.jpg
>
>
> I have this old Mac G5 running as a fine example of a big-endian machine
> and the PPC970MP processors in it seem to work very well. However it is
> certainly becoming difficult to get results from it that can compare to
> what I get from some other machines like Fujitsu SPARC for example. The
> biggest complaint is with floating point wherein the data representation
> may be actual IEEE 754-2008 style or some new IBM variant that I am not
> at all familiar with. In fact, some code, trivial, won't compile at all
> if I try to use "IEEE extended precision long double" with very few ways
> to get around that :
>
> gcc -mcpu=970 -mno-altivec -m64 -std=iso9899:1999 -Wfatal-errors \
> -pedantic-errors -mabi=ieeelongdouble ...
>
> The gcc that I am using claims to be :
>
> GNU C99 (Debian 7.2.0-17) version 7.2.1 20171205 (powerpc64-linux-gnu)
> compiled by GNU C version 7.2.1 20171205, GMP version 6.1.2,
> MPFR version 3.1.6, MPC version 1.0.3, isl version isl-0.18-GMP
>
>
> I can take the exact same source of a trivial floating point test and
> drop it on very very old sparc as well as a system running very up to
> date Red Hat Enterprise Linux 7.4 with AMD Opterons. Also this old mac
> g5 with its PPC970MP processors where I see wildly different results on
> all of them. When I say "wildly" I mean to say that the in memory data
> isn't even remotely the same given the same constant inputs. I know that
> the x86 hardware is somewhat crippled ( a strange ten byte format ) in
> this regard but I was quite surprised by what happens on the PPC970MP
> processors when compared to sparc. Regardless what compiler I use on
> the sparc ( very very old Sun and much newer Fujitsu ) with Solaris 10
> I always get nearly perfect results. The Debian PPC970MP produces close
> results but again the in memory data is quite different.
>
> In any case there are people out there messing with these things for
> various reasons ( educational even in that I do teach ) and it is quite
> weird to have to say to a student that in the year 2018 don't expect
> similar results across different machines when it comes to doing any
> floating point math.
>
> Dennis
>
> ps: long boring stuff follows where numbers don't quite work
> and libquadmath seems to be out of the question.

This is quite well known: for a long time, IBM on Power (not on
mainframes) used a non-IEEE format for long doubles. These are actually
two IEEE doubles "concatenated", so:
- the mantissa is somewhat less precise: 2 times 53 bits instead of 112
  (113 counting the implicit leading bit)
- the exponent range is far smaller: in powers of 10 the range is
  roughly ±308 (same as double) instead of ±4932.

The fact that the in-memory representation is completely different is
not surprising when you take this into account.

This was somewhat faster than a full emulation of IEEE quad math, but
IBM has now switched to real IEEE quad (implemented in hardware on
Power9; I suspect most Sparc machines do it in software).

For more details, you may have a look at:

https://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format

There is even a full paragraph on double-double arithmetic.

I'm away from my Power machine right now and it is switched off, so I
can't try your code and play with compiler options.

Cheers,
Gabriel

>
> ----- feel free to compile this on anything and show results ------
>
> #define _XOPEN_SOURCE 600
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <math.h>
> #include <locale.h>
> #include <sys/utsname.h>
>
> int main (int argc, char* argv[]){
>
>     int j;
>     struct utsname uname_data;
>     long double theta, pi, approx_pi, one_over_sqrt2, ld_error;
>
>     setlocale( LC_MESSAGES, "C" );
>     if ( uname( &uname_data ) < 0 ) {
>         fprintf ( stderr,
>                   "WARNING : Could not attain system uname data.\n" );
>         perror ( "uname" );
>     } else {
>         printf (" system name = %s\n", uname_data.sysname );
>         printf (" node name = %s\n", uname_data.nodename );
>         printf (" release = %s\n", uname_data.release );
>         printf (" version = %s\n", uname_data.version );
>         printf (" machine = %s\n", uname_data.machine );
>     }
>     printf ("\n");
>
>     /* plenty of digits well past the precision of binary128 */
>     pi = 3.1415926535897932384626433832795028841971693993751L;
>
>     printf("sizeof(long double) = %2i\n", sizeof(long double));
>     printf(" pi may be 
%+40.38Lf\n", pi);
>     printf("reference val = ");
>     printf("+3.1415926535897932384626433832795028841971693993751\n\n");
>
>     printf("%p : ", &pi);
>     for ( j=0; j<sizeof(long double); j++ )
>         printf("%02x ", ((unsigned char *)&pi)[j] );
>     printf("\n\n" );
>
>     ld_error = (long double)
>                3.1415926535897932384626433832795028841971693993751L
>                - pi;
>     printf(" ld_error = %+40.38Lf\n\n", ld_error);
>
>     printf("sinl(pi) may be %+40.38Lf\n", sinl(pi));
>
>     approx_pi = (long double) 4.0L * atanl( (long double) 1.0L);
>     printf(" approx_pi = %+40.38Lf\n", approx_pi);
>     ld_error = (long double)
>                3.1415926535897932384626433832795028841971693993751L
>                - approx_pi;
>
>     printf(" ld_error = %+40.38Lf\n\n", ld_error);
>
>     theta = pi / ( (long double) 4.0L);
>     printf(" theta = %+40.38Lf\n", theta);
>     one_over_sqrt2 = sinl(theta);
>     printf(" sinl(theta) = %+40.38Lf\n", one_over_sqrt2);
>
>     ld_error = (long double)
>                0.7071067811865475244008443621048490392848359376884L
>                - one_over_sqrt2;
>
>     printf(" ld_error = %+40.38Lf\n\n", ld_error);
>
>     return EXIT_SUCCESS;
>
> }
>
> EOF
> If you copy and paste that correctly you should have sha256 hash :
>
> 836282023b62d3a09b6ad59424951d873b965a594f23e6c41d596c4845f74d5d
>
> dc@n0$ psrinfo -pv
> The physical processor has 8 virtual processors (0-7)
> SPARC64-VII+ (portid 1024 impl 0x7 ver 0xa1 clock 2860 MHz)
> dc@n0$ /usr/local/gcc6/bin/gcc --version
> gcc (genunix Wed Jul 26 02:41:24 GMT 2017) 6.4.0
> Copyright (C) 2017 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions. There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> dc@n0$ /usr/local/gcc6/bin/gcc -m64 -std=iso9899:1999 -Wfatal-errors
> -pedantic-errors -o s s.c -lm
> dc@n0$ ./s
> system name = SunOS
> node name = node000
> release = 5.10
> version = Generic_150400-59
> machine = sun4u
>
> sizeof(long double) = 16
> pi may be +3.14159265358979323846264338327950279748
> reference val = +3.1415926535897932384626433832795028841971693993751
>
> ffffffff7fffeed0 : 40 00 92 1f b5 44 42 d1 84 69 89 8c c5 17 01 b8
>
> ld_error = +0.00000000000000000000000000000000000000
>
> sinl(pi) may be +0.00000000000000000000000000000000008672
> approx_pi = +3.14159265358979323846264338327950279748
> ld_error = +0.00000000000000000000000000000000000000
>
> theta = +0.78539816339744830961566084581987569937
> sinl(theta) = +0.70710678118654752440084436210484899217
> ld_error = +0.00000000000000000000000000000000000000
>
>
> however ....
>
> ppc_nix$
> ppc_nix$ gcc --version
> gcc (Debian 7.2.0-17) 7.2.1 20171205
> Copyright (C) 2017 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions. There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> ppc_nix$ grep "^cpu" /proc/cpuinfo
> cpu : PPC970MP, altivec supported
> cpu : PPC970MP, altivec supported
> cpu : PPC970MP, altivec supported
> cpu : PPC970MP, altivec supported
> ppc_nix$
>
> ppc_nix$ openssl dgst -sha256 s.c
> SHA256(s.c)=
> 836282023b62d3a09b6ad59424951d873b965a594f23e6c41d596c4845f74d5d
>
> ppc_nix$ gcc -mcpu=970 -mno-altivec -m64 -std=iso9899:1999 -Wfatal-errors
> -pedantic-errors -mabi=ieeelongdouble -o s s.c -lm
> gcc: warning: using IEEE extended precision long double
> cc1: warning: using IEEE extended precision long double
> /tmp/cc348kuM.o: In function `main':
> s.c:(.text+0x26c): undefined reference to `_q_sub'
> s.c:(.text+0x3ac): undefined reference to `_q_sub'
> s.c:(.text+0x424): undefined reference to `_q_div'
> s.c:(.text+0x4ec): undefined reference to `_q_sub'
> collect2: error: ld returned 1 exit status
> ppc_nix$
>
> ppc_nix$ gcc -mcpu=970 -mno-altivec -m64 -std=iso9899:1999 -Wfatal-errors
> -pedantic-errors -mabi=ibmlongdouble -o s s.c -lm
> gcc: warning: using IBM extended precision long double
> cc1: warning: using IBM extended precision long double
> ppc_nix$ ./s
> system name = Linux
> node name = nix
> release = 4.13.0-1-powerpc64
> version = #1 SMP Debian 4.13.13-1 (2017-11-16)
> machine = ppc64
>
> sizeof(long double) = 16
> pi may be +3.14159265358979323846264338327948122706
> reference val = +3.1415926535897932384626433832795028841971693993751
>
> 0x7fffc9d0c230 : 40 09 21 fb 54 44 2d 18 3c a1 a6 26 33 14 5c 06
>
> ld_error = +0.00000000000000000000000000000000000000
>
> sinl(pi) may be +0.00000000000000000000000000000002165713
> approx_pi = +3.14159265358979323846264338327948122706
> ld_error = +0.00000000000000000000000000000000000000
>
> theta = +0.78539816339744830961566084581987030677
> sinl(theta) = +0.70710678118654752440084436210483464400
> ld_error = +0.00000000000000000000000000000000616298
>
> ppc_nix$
>
>
> A twenty year old sparc gives better results when using gcc 7.2.0 :
>
> mimas $ psrinfo -pv
> The physical processor has 1 virtual processor (0)
> UltraSPARC-IIe (portid 0 impl 0x13 ver 0x14 clock 500 MHz)
>
> mimas $ /usr/local/gcc7/bin/gcc --version
> gcc (genunix Tue Aug 29 11:48:17 GMT 2017) 7.2.0
> Copyright (C) 2017 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions. There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> mimas $
>
> mimas $ openssl dgst -sha256 s.c
> SHA256(s.c)=
> 836282023b62d3a09b6ad59424951d873b965a594f23e6c41d596c4845f74d5d
>
> mimas $ /usr/local/gcc7/bin/gcc -m64 -std=iso9899:1999 -Wfatal-errors
> -pedantic-errors -o s s.c -lm
> mimas $ ./s
> system name = SunOS
> node name = mimas
> release = 5.10
> version = Generic_150400-57
> machine = sun4u
>
> sizeof(long double) = 16
> pi may be +3.14159265358979323846264338327950279748
> reference val = +3.1415926535897932384626433832795028841971693993751
>
> ffffffff7ffff0a0 : 40 00 92 1f b5 44 42 d1 84 69 89 8c c5 17 01 b8
>
> ld_error = +0.00000000000000000000000000000000000000
>
> sinl(pi) may be +0.00000000000000000000000000000000008672
> approx_pi = +3.14159265358979323846264338327950279748
> ld_error = +0.00000000000000000000000000000000000000
>
> theta = +0.78539816339744830961566084581987569937
> sinl(theta) = +0.70710678118654752440084436210484899217
> ld_error = +0.00000000000000000000000000000000000000
>
> mimas $
>
> Other than the memory address this is bit for bit exact same as the
> newer Fujitsu server. I was hoping to see the exact same from the
> mac PPC970MP based unit.
>

