[Bug libgcc/66382] POWER8 Vector optimized implementation of __float128 (IEEE754 128-bit Binary Floating Point)

2016-03-28 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66382

--- Comment #7 from Michael Meissner  ---
I do not think it is worthwhile to expand the IEEE 128-bit software emulation
routines inline at the point of call for ISA 2.07 (power8), because the
emulation library does a lot of processing for each operation.

Now if people were motivated, we could replace the soft-fp functions with tuned
versions written for the Power8 ISA.

However, I don't think it is going to be an easy task.  I'll list the things
that come to mind first.  They could probably be done, but it comes down to a
time/effort calculation of whether the return on investment is worth it.

The first issue is that the next ISA (3.0) already supports IEEE 128-bit
floating point in hardware, including the various rounding modes, etc. If you
configure and build gcc with an assembler that supports the ISA 3.0
instruction set, it will add in IFUNC support so that when a program is run on
ISA 3.0 hardware, it will automatically use a version of __addkf3 that uses
the xsaddqp instruction instead of doing the emulation. So this effort would
only benefit the current generation of hardware.
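
To illustrate the dispatch idea described above, here is a minimal portable
sketch. It is not libgcc's actual IFUNC resolver: it uses a plain function
pointer, and the names f128_t, add_soft, add_hw and resolve_add are
hypothetical. long double stands in for __float128 so that the sketch builds
on any target.

```c
#include <stdbool.h>

/* Stand-in for __float128 so the sketch compiles anywhere (assumption). */
typedef long double f128_t;

typedef f128_t (*add_fn)(f128_t, f128_t);

/* Hypothetical software path; in libgcc this role is played by the
   soft-fp based emulation routine. */
f128_t add_soft(f128_t a, f128_t b) { return a + b; }

/* Hypothetical hardware path; on ISA 3.0 the body would reduce to a
   single xsaddqp instruction. */
f128_t add_hw(f128_t a, f128_t b) { return a + b; }

/* Resolver: called once, picks an implementation from a CPU capability
   flag.  A real IFUNC resolver is run by the dynamic linker instead. */
add_fn resolve_add(bool have_ieee128_hw)
{
    return have_ieee128_hw ? add_hw : add_soft;
}
```

A caller would store the result of resolve_add() once and call through the
pointer afterwards, so the capability check is not repeated per operation.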

The next issue is right now you cannot do 64-bit scalar int arithmetic in the
VSX unit. At present, the compiler does not allow DImode into the Altivec
registers, but the only support for 64-bit integer arithmetic uses an Altivec
encoding and only uses Altivec registers (vaddudm, etc.).

I am working on patches to allow DImode variables in Altivec registers (and
later SImode, HImode, QImode). My first run with the patch shows one benchmark
2% faster (perlbench) and one 2% slower (omnetpp), but I feel it needs a lot
more tuning.  At the moment, that work is lower priority, as making
__float128 _Complex work is my highest priority (obviously other people could
take up this work).

After allowing DImode into Altivec registers, and perhaps doing 64-bit
arithmetic via the 64-bit integer vector instructions, another issue is that
the cycle time of the vector unit is 1/2 that of the GPR unit, so it will need
a lot of tuning.

I don't think the ISA 2.07 instruction set is general enough to do inserts and
extracts of 128-bit values that you would need for packing and unpacking the
IEEE 128-bit floating point values. ISA 3.0 has all of this support, including
specialized instructions to extract/set the exponent or mantissa, but then it
also has the hardware support for IEEE 128-bit floating point.
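
For reference, the packing and unpacking mentioned above can be sketched with
plain 64-bit integer operations. This is only an illustration of the IEEE
binary128 field layout (1 sign bit, 15 exponent bits, 112 fraction bits), not
code from libgcc; the types f128_bits and f128_fields and both function names
are hypothetical.

```c
#include <stdint.h>

/* Hypothetical raw view of a binary128 value as two 64-bit words. */
typedef struct { uint64_t hi, lo; } f128_bits;

/* Hypothetical unpacked form: sign, biased exponent, 112-bit fraction. */
typedef struct {
    unsigned sign;      /* 1 bit                      */
    unsigned exp;       /* 15 bits, bias 16383        */
    uint64_t frac_hi;   /* upper 48 bits of fraction  */
    uint64_t frac_lo;   /* lower 64 bits of fraction  */
} f128_fields;

f128_fields f128_unpack(f128_bits x)
{
    f128_fields f;
    f.sign    = (unsigned)(x.hi >> 63);
    f.exp     = (unsigned)((x.hi >> 48) & 0x7FFF);
    f.frac_hi = x.hi & 0x0000FFFFFFFFFFFFULL;
    f.frac_lo = x.lo;
    return f;
}

f128_bits f128_pack(f128_fields f)
{
    f128_bits x;
    x.hi = ((uint64_t)(f.sign & 1u) << 63)
         | ((uint64_t)(f.exp & 0x7FFFu) << 48)
         | (f.frac_hi & 0x0000FFFFFFFFFFFFULL);
    x.lo = f.frac_lo;
    return x;
}
```

In GPRs these are a handful of shifts and masks; the concern above is that
doing the same on values held in vector registers is awkward in ISA 2.07.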

Loads and stores are also problematic in ISA 2.07, given there is no d-form
addressing for Altivec registers.  So if you spill a DImode value in an Altivec
register, you need to load up the offset in a GPR to do the memory operation.

2016-03-26 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66382

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #6 from Segher Boessenkool  ---
Given the following program:

__float128 f(__float128 a, __float128 b) { return a*b; }

When compiled with:

gcc -Wall -W -O2 mulf.c -mfloat128 -mcpu=power8

you currently get (boring stuff cut out):

mflr 0
std 0,16(1)
stdu 1,-32(1)
bl __mulkf3
nop
addi 1,1,32
ld 0,16(1)
mtlr 0
blr

The task is either to optimise __mulkf3 to use vector math, or even to
expand it inline where that makes sense (this may more often
make sense with -ffast-math).

2016-03-21 Thread munroesj at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66382

Steven Munroe  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |meissner at gcc dot gnu.org

--- Comment #5 from Steven Munroe  ---
With GCC 6.0 we have basic support for the __float128 type, and libgcc has a
soft-fp implementation for __float128.

This uses vector registers for __float128 parameter passing, but is implemented
with 64-bit integer arithmetic internally.

This is functional but not optimized.
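
As an illustration of the kind of 64-bit integer work such an emulation does
(for example when multiplying mantissas), here is a sketch of a 64x64 -> 128
bit multiply built from 32-bit partial products. It is not taken from soft-fp;
the type u128 and the function mul_64x64 are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical 128-bit result held as two 64-bit halves. */
typedef struct { uint64_t hi, lo; } u128;

/* Full 64x64 -> 128 bit unsigned multiply using four 32x32 products. */
u128 mul_64x64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = a & 0xFFFFFFFFULL, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFULL, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* contributes to bits   0..63  */
    uint64_t p1 = a_lo * b_hi;   /* contributes to bits  32..95  */
    uint64_t p2 = a_hi * b_lo;   /* contributes to bits  32..95  */
    uint64_t p3 = a_hi * b_hi;   /* contributes to bits  64..127 */

    /* Sum the pieces that land in bits 32..63; cannot overflow 64 bits. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFULL) + (p2 & 0xFFFFFFFFULL);

    u128 r;
    r.lo = (mid << 32) | (p0 & 0xFFFFFFFFULL);
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

A 113-bit mantissa product needs several such multiplies plus carry
propagation, which gives a sense of how much work each emulated operation
involves.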

Mike Meissner can help you with the gcc configurations.

2016-03-14 Thread joseph at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66382

--- Comment #4 from joseph at codesourcery dot com  ---
On Mon, 14 Mar 2016, dan.parrot at mail dot com wrote:

> However, I am still unable to get gcc to compile a very simple program 
> when passed the -msoft-float option. Here is the program (test.c)

For 32-bit, -msoft-float is a distinct ABI from -mhard-float, and requires 
its own copies of libgcc, libc, etc.; you can't just build and run a 
soft-float program on a hard-float system without building all those 
separate libraries and arranging for the program to find them at link time 
and run time.

For 64-bit, -msoft-float is not supported, and hard-float is always 
available.  (__float128 might use software or hardware floating point - 
software before POWER9, hardware with POWER9 - but that's controlled by 
other options.)

2016-03-14 Thread dan.parrot at mail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66382

--- Comment #3 from dan.parrot at mail dot com ---
I am trying to configure gcc so that instead of generating instructions 
that use the hardware floating point unit, it will generate instructions 
that utilize integer operations to emulate floating point operations.

The description of the bug by David Edelsohn on Bugzilla implies that 
such emulation is currently available and would be made more efficient 
if integer operations are replaced by vector operations. It is this 
replacement that I'm trying to effect.

However, I am still unable to get gcc to compile a very simple program 
when passed the -msoft-float option. Here is the program (test.c)

==
#include <stdio.h>
/* (two further #include lines followed here; the header names were lost) */

int
main()
{
 /* sizeof yields a size_t, so print it with %zu. */
 printf("\n No. of bytes in a long double is %zu.\n", sizeof(long double));

 long double x = 5.55L;
 long double y = -5.56L;

 long double z = x + y;
 long double w = x - y;
 /* %Lf is the conversion for long double. */
 printf("\n Sum   : \n x = %Lf \n y = %Lf \n z = %Lf \n", x, y, z);
 printf("\n Diff. : \n x = %Lf \n y = %Lf \n w = %Lf \n", x, y, w);

 return (int)(z + w);
}
===

Without the -msoft-float flag to gcc, it prints the expected result.

With the -msoft-float flag, linking fails.

My question is this: Can you successfully compile the simple program 
above in any version of gcc? If yes, what is the output of "gcc -v"? If 
not, does it mean there is in fact no software emulation available?

Thanks.
Dan.

On 03/14/2016 03:43 PM, munroesj at us dot ibm.com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66382
>
> --- Comment #2 from Steven Munroe  ---
> What is the issue? You want to configure __float128 without also configuring
> altivec/VMX/VSX?
>
> The PowerPC 64-bit ABI is defined to pass __float128 values in 128-bit vector
> registers and return __float128 values in VR2.
>
> How parameters are passed is independent of the computation.
>

2016-03-14 Thread munroesj at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66382

--- Comment #2 from Steven Munroe  ---
What is the issue? You want to configure __float128 without also configuring
altivec/VMX/VSX?

The PowerPC 64-bit ABI is defined to pass __float128 values in 128-bit vector
registers and return __float128 values in VR2.

How parameters are passed is independent of the computation.

2016-03-06 Thread dan.parrot at mail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66382

dan.parrot at mail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dan.parrot at mail dot com

--- Comment #1 from dan.parrot at mail dot com ---
Could you provide the options to ./configure which build gcc while allowing
flag -msoft-float and type __float128 to co-exist? Or is changing the code to
allow that combination part of the task here?

I have been unable to configure a build that accepts them (-msoft-float and
__float128) together.