Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-04 Thread J. Gareth Moreton
I might hold on this for a little bit until I get more out of my node outputting feature, since I need to study the nodes produced by an inlined Floor function carefully.  For example, Floor's formal parameter is further passed separately into Trunc and Frac - normally it's not a problem, but if

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-04 Thread Florian Klämpfl
On 04.02.19 at 17:47, J. Gareth Moreton wrote: Oh whoops, sorry about that and not replying to the list. I'll try not to screw up.  Generally I think Double is preferred because then everything uses SSE2 and no awkward ferrying of data between it and the floating-point stack is required

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-04 Thread J. Gareth Moreton
Oh whoops, sorry about that and not replying to the list. I'll try not to screw up.  Generally I think Double is preferred because then everything uses SSE2 and no awkward ferrying of data between it and the floating-point stack is required (come to think of it, only Win64 actually requires the

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-04 Thread Sven Barth via fpc-devel
On Mon, 4 Feb 2019 at 14:15, J. Gareth Moreton wrote:
> Oh right, okay, so x86_64-win64 is Double (even though Extended is
> supported), but other x86_64 platforms are Extended, right? A little bit
> odd, but I'll keep an eye out in that case.
>
Correct. Though Extended is not
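The platform difference discussed above can be checked directly on whatever target one is building for. The following is only a minimal sketch: the program name is made up for illustration, and it assumes the `FPC_HAS_TYPE_EXTENDED` define and the Math unit's `float` type behave as they do in current Free Pascal releases. No output is listed because it depends on the target.

```pascal
program FloatTypeSketch; { hypothetical name, for illustration only }
{$mode objfpc}
uses
  Math;
begin
  {$ifdef FPC_HAS_TYPE_EXTENDED}
  { Assumption: this define is set only when the target has a real
    80-bit Extended type (e.g. x86_64 outside Win64). }
  WriteLn('Extended is a distinct type, SizeOf = ', SizeOf(Extended));
  {$else}
  WriteLn('No distinct Extended type on this target');
  {$endif}
  WriteLn('SizeOf(Double) = ', SizeOf(Double));
  { Math.float follows the widest supported type, so comparing its size
    with Double shows which one the Math routines are declared with. }
  WriteLn('SizeOf(float)  = ', SizeOf(float));
end.
```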

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-04 Thread Sven Barth via fpc-devel
On Sun, 3 Feb 2019 at 18:29, J. Gareth Moreton wrote:
> To reassure, I'm aware that "float" is normally "extended" outside of
> x86_64, and I would keep my changes constrained to that platform.
>
This statement is not correct: the default floating point type for nearly all of FPC's

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread J. Gareth Moreton
I'll see what I can put together.  Personally I'd prefer it if the source code contained something that says "call this internal procedure" or "insert this special node", although that would require some careful design.

Gareth aka. Kit

On Sun 03/02/19 22:05, Florian Klämpfl

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread Florian Klämpfl
On 03.02.19 at 21:52, J. Gareth Moreton wrote: It just seems highly dependent on the source code and can easily break if it's changed... and not just the Floor function, but also possibly if Trunc and Frac are modified in some way.  The code does boil down to two instructions in SSE 4.1 and

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread Tomas Hajny
On Sun, February 3, 2019 17:27, J. Gareth Moreton wrote:
 .
 .
> To reassure, I'm aware that "float" is normally "extended" outside of
> x86_64, and I would keep my changes constrained to that platform.
 .
 .
Just to make sure - did you notice that even on x86_64, Extended is still used for

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread J. Gareth Moreton
It just seems highly dependent on the source code and can easily break if it's changed... and not just the Floor function, but also possibly if Trunc and Frac are modified in some way.  The code does boil down to two instructions in SSE 4.1 and AVX, but it depends on many different nodes with an

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread Florian Klämpfl
On 03.02.19 at 22:29, Jonas Maebe wrote: On 03/02/19 21:26, J. Gareth Moreton wrote: One thing that I should ask though... if a unit like Math is compiled with -fAVX, then another project that uses it is built without any special floating-point types, is Math recompiled or will it use the

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread J. Gareth Moreton
Aah, now I see how it can be a little problematic and why node optimisation is a better approach.  Hmmm, this might take some thought.

Gareth aka. Kit

On Sun 03/02/19 21:29, Jonas Maebe jo...@freepascal.org sent:
On 03/02/19 21:26, J. Gareth Moreton wrote:
> One thing that I should ask

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread Jonas Maebe
On 03/02/19 21:26, J. Gareth Moreton wrote: One thing that I should ask though... if a unit like Math is compiled with -fAVX, then another project that uses it is built without any special floating-point types, is Math recompiled or will it use the code already built, thereby possibly putting

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread J. Gareth Moreton
I would like to improve more of the mathematical functions, but unless some of them are promoted to internal functions, having micro-optimisations in the code feels very bloated and will be a maintenance nightmare due to the amount of interdependency - for example, things like the floor function

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread J. Gareth Moreton
It's certainly possible, but feels a little finicky, since floor64 is not an internal function unlike, say, the trigonometric functions.  It will also break if the original code is changed.  It feels like a kludge, especially if another programmer down the line tries to rewrite the function and

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread Florian Klämpfl
On 03.02.19 at 06:26, J. Gareth Moreton wrote: Hi everyone, So I'm looking to improve some of the mathematical routines.  However, not all of them are internal functions; some are stored in the Math unit.  Some of them are written in assembly language but use the old floating-point stack, or

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread Jonas Maebe
On 03/02/19 06:26, J. Gareth Moreton wrote: And similarly with AVX.  Even if we're stuck with just SSE2, assembler improvements can be made over the Pascal code just by removing the call to Trunc and replacing it with its single assembler command: "cvttsd2si %xmm0,%rax". The compiler already

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread Bart
On Sun, Feb 3, 2019 at 7:28 AM J. Gareth Moreton wrote:
> {$ASMMODE ATT}
> function floor64(x: float): Int64; assembler; nostackframe;
> asm
>   roundsd %xmm0, %xmm0, $0b101 { Round towards negative infinity }
>   cvttsd2si %xmm0, %rax { Convert to integer... equivalent to Trunc() }
>

Re: [fpc-devel] Detecting SSE and AVX compiler options

2019-02-03 Thread Sven Barth via fpc-devel
On Sun, 3 Feb 2019 at 07:28, J. Gareth Moreton wrote:
> As an example of a function that can benefit from a speed-up under
> x86_64... the floor() and floor64() functions:
>
> function floor64(x: float): Int64;
> begin
>   Result:=Trunc(x)-ord(Frac(x)<0);
> end;
>
Please keep in
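For readers unfamiliar with the Trunc/Frac formulation quoted above, here is a small worked sketch of why subtracting `ord(Frac(x)<0)` turns truncation into a floor. The function is renamed `Floor64Sketch` (a hypothetical name) and takes a `Double` so the program is self-contained and does not clash with the real `Floor64` in the Math unit.

```pascal
program FloorSketch; { hypothetical name, for illustration only }
{$mode objfpc}

{ Same expression as the quoted floor64, restated on Double.
  Trunc rounds towards zero, so for a negative non-integer it lands one
  above the floor; Frac(x) < 0 is True exactly in that case, and
  Ord(True) = 1 supplies the correction. }
function Floor64Sketch(x: Double): Int64;
begin
  Result := Trunc(x) - Ord(Frac(x) < 0);
end;

begin
  WriteLn(Floor64Sketch(2.7));   { Trunc = 2,  Frac > 0, prints 2  }
  WriteLn(Floor64Sketch(-2.7));  { Trunc = -2, Frac < 0, prints -3 }
  WriteLn(Floor64Sketch(-3.0));  { Trunc = -3, Frac = 0, prints -3 }
end.
```

The comparison and subtraction are what the thread proposes collapsing into a rounding instruction followed by a conversion on SSE 4.1 and later.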