Re: [fpc-devel] Detecting SSE and AVX compiler options

J. Gareth Moreton Tue, 05 Feb 2019 19:41:15 -0800

 This might prove quite complicated to implement.  Writing a similar test
function and modifying my node dump to get what I need (the node dump tool
does its work before the second pass, so inlined functions aren't
expanded), I get a very complicated node tree for a single inlined function
call to "floor" with a parameter of type Single:


 ----

 ...
 ......
 .........2
 .........SmallInt
 .........$000000005E511570
 .........
 ......
 ...
 ...
 ......
 .........8
 .........Double
 .........$000000005E5114F0
 .........
 ......
 ...
 ...
 ......
 .........
 ............Double
 ............$000000005E5114F0
 ............ti_may_be_in_reg
 .........
 ......
 ......
 .........
 ............
 ...............TESTSINGLES
 ............
 .........
 .........
 ............
 ...............X
 ............
 .........
 ......
 ...
 ...
 ......
 .........
 ............
 ...............SmallInt
 ...............$000000005E511570
 ...............ti_may_be_in_reg
 ............
 .........
 ......
 ......
 .........
 ............
 ...............
 ..................Double
 ..................$000000005E5114F0
 ..................ti_may_be_in_reg
 ...............
 ............
 .........
 .........
 ............
 ...............
 ..................
 .....................$fpc_frac_real(Double):Double;
 .....................
 ........................
 ........................
 ...........................Double
 ...........................$000000005E5114F0
 ...........................ti_may_be_in_reg
 ........................
 .....................
 ..................
 ...............
 ...............
 .................. 0.0000000000000000E+000
 ...............
 ............
 .........
 ......
 ...
 ...
 ......
 .........FALSE
 .........Double
 .........tt_persistent
 .........$000000005E5114F0
 ......
 ...
 ...
 ......
 .........TRUE
 .........SmallInt
 .........tt_persistent
 .........$000000005E511570
 ......
 ...
 ...
 ......
 .........SmallInt
 .........$000000005E511570
 .........ti_may_be_in_reg
 ......
 ...

 ----

 Because it's a test function, the result type is SmallInt rather than
LongInt due to the compiler options, but the effect is the same.  Most of
the attributes and flags I can ignore, but it's going to be a mammoth task
to check all of these nodes and confirm that the function is what it's
meant to be.  Don't get me wrong, it can be done, but I'm worried it will
take the compiler a disproportionately long time doing so.

 Still, at the same time, this is an example where the node dump is useful,
even if it still needs some work.

 Gareth aka. Kit

 On Mon 04/02/19 19:28 , "J. Gareth Moreton" gar...@moreton-family.com sent:
  I might hold on this for a little bit until I get more out of my node
outputting feature, since I need to study the nodes produced by an inlined
Floor function carefully.  For example, Floor's formal parameter is
further passed separately into Trunc and Frac - normally it's not a
problem, but if the actual parameter is a complex expression (i.e. isn't a
simple constant or variable), then it may produce even more nodes as it's
calculated twice, once for Trunc and once for Frac... or it's computed
beforehand and put into a temporary store that's hidden from the
programmer.  I won't know for sure until I study the nodes and make a good
contingency.
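
 To make the double-evaluation concern concrete, here is a minimal sketch
in C rather than Pascal. It is purely illustrative: it is not the Math
unit's source, every name in it is hypothetical, and it assumes the value
fits in a 32-bit integer. A macro stands in for an inlined body that
substitutes the argument expression at both uses, while an ordinary
function stands in for the hidden-temporary variant.

```c
#include <assert.h>
#include <stdint.h>

/* Counts evaluations of the argument expression (illustration only). */
static int eval_count = 0;

/* Hypothetical stand-in for a complex actual parameter. */
static double expensive_arg(double v) {
    eval_count++;
    return v * 0.5;
}

/* Truncate toward zero; assumes x fits in int32_t. */
static int32_t trunc_i32(double x) { return (int32_t)x; }

/* Fractional part, carrying the sign of x. */
static double frac_part(double x) { return x - (double)trunc_i32(x); }

/* Naive inlining: the argument expression is substituted at both uses. */
#define FLOOR_NAIVE(expr) (trunc_i32(expr) - (frac_part(expr) < 0.0 ? 1 : 0))

/* Hidden temporary: the parameter x holds the value, computed once. */
static int32_t floor_with_temp(double x) {
    return trunc_i32(x) - (frac_part(x) < 0.0 ? 1 : 0);
}

/* Usage: both give floor(-1.5) = -2, but the evaluation counts differ. */
static void demo_double_evaluation(void) {
    eval_count = 0;
    assert(FLOOR_NAIVE(expensive_arg(-3.0)) == -2);
    assert(eval_count == 2);

    eval_count = 0;
    assert(floor_with_temp(expensive_arg(-3.0)) == -2);
    assert(eval_count == 1);
}
```

 Which of the two shapes the compiler actually produces for an inlined
call is exactly what the node dump would have to confirm.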

 I'll likely make 3 versions of the floor function (not including the
Pascal version that already exists, which the compiler can fall back on if
it's dealing with the "Extended" type, for example), one that uses SSE2,
one that uses SSE4.1 (which introduces the ROUNDSD instruction) and one
that uses AVX (which is effectively identical to the SSE4.1 one, albeit
using the AVX instructions).
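
 For the SSE2 variant, which has no ROUNDSD, one possible shape is
truncate, convert back, compare and correct. The sketch below writes those
steps in plain C rather than assembler or Pascal. The function names are
hypothetical, the instruction names in the comments are an assumption about
what an x86-64 compiler would typically emit, and inputs are assumed to fit
in a 32-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Floor without a rounding instruction: truncate, then fix up negatives. */
static int32_t floor_truncate_and_fix(double x) {
    int32_t t = (int32_t)x;   /* rounds toward zero (typically CVTTSD2SI) */
    if ((double)t > x) {      /* converting back (typically CVTSI2SD) shows
                                 truncation rounded up: x was negative and
                                 not a whole number */
        t -= 1;
    }
    return t;
}

/* Usage: a few spot checks on either side of zero. */
static void demo_floor_truncate_and_fix(void) {
    assert(floor_truncate_and_fix(1.5) == 1);
    assert(floor_truncate_and_fix(-1.5) == -2);
    assert(floor_truncate_and_fix(-3.0) == -3);
}
```

 With SSE4.1 or AVX available, the compare-and-correct step should no
longer be needed, since ROUNDSD (or VROUNDSD) can round down directly
before the conversion to integer.
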

 The node optimisation is definitely the better choice, thinking about it
now, also because if the compiler determines that the parameters are of
type Single, it can use the single-precision SSE instructions rather than
converting from Single to Double and back again.  I just feel like this is
possibly a little bloated because it's the kind of optimisation that
belongs to an internal function rather than one in a supplementary unit...
unless you want to promote "floor" and similar functions from the Math unit
into internal functions through the System unit.

 This is proving to be a fascinating learning experience, not just of
coding but also of design and discussion!

 Gareth aka. Kit

 On Mon 04/02/19 20:04 , "Florian Klämpfl" flor...@freepascal.org sent:
 On 04.02.19 at 17:47, J. Gareth Moreton wrote:
 > Oh whoops, sorry about that and not replying to the list. 
 > 
 > I'll try not to screw up.  Generally I think Double is preferred because
 > then everything uses SSE2 and no awkward ferrying of data between it and
 > the floating-point stack is required (come to think of it, only Win64
 > actually requires the presence of SSE2 and refuses to install if it's
 > not present).
 > 
 > Given that Florian prefers a node micro-optimisation for functions like 
 > floor, it should be easy enough to check if the input is of type Single 
 > or Double, and drop out if it's Extended (falling back to the actual 
 > source code). 

 Well, in the case of a node optimization in combination with inline, I do
 not see it as a real micro-optimization, as it results in the best code,
 which is not the case if it is ifdef'ed assembler code in a unit which is
 most of the time not used (the fpc x86-64 rtl is normally built with
 -Cfsse2).
 _______________________________________________ 
 fpc-devel maillist - fpc-devel@lists.freepascal.org
 http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel