Re: [fpc-devel] Division nodes

J. Gareth Moreton via fpc-devel Thu, 11 May 2023 11:42:21 -0700

This is the code block in question (ncnv.pas, starting at line 3397) - if anyone can explain why it has to be set up this way, or add comments to the code, I will be most grateful (it's run for the following node types: subn, addn, muln, divn, modn, xorn, andn, orn, shln, shrn):


  exclude(n.flags,nf_internal);
  if not forceunsigned and
     is_signed(n.resultdef) then
    begin
      originaldivtree:=nil;
      if n.nodetype in [divn,modn] then
        originaldivtree:=n.getcopy;
      doremoveinttypeconvs(level+1,tbinarynode(n).left,signedtype,false,signedtype,unsignedtype);
      doremoveinttypeconvs(level+1,tbinarynode(n).right,signedtype,false,signedtype,unsignedtype);
      n.resultdef:=signedtype;
      if n.nodetype in [divn,modn] then
        begin
          newblock:=internalstatements(newstatements);
          tempnode:=ctempcreatenode.create(n.resultdef,n.resultdef.size,tt_persistent,true);
          addstatement(newstatements,tempnode);
          addstatement(newstatements,cifnode.create_internal(
              caddnode.create_internal(equaln,tbinarynode(n).right.getcopy,cordconstnode.create(-1,n.resultdef,false)),
              cassignmentnode.create_internal(
                ctemprefnode.create(tempnode),
                cmoddivnode.create(n.nodetype,tbinarynode(originaldivtree).left.getcopy,cordconstnode.create(-1,tbinarynode(originaldivtree).right.resultdef,false))
              ),
              cassignmentnode.create_internal(
                ctemprefnode.create(tempnode),n
              )
            )
          );
          addstatement(newstatements,ctempdeletenode.create_normal_temp(tempnode));
          addstatement(newstatements,ctemprefnode.create(tempnode));
          n:=newblock;
          do_typecheckpass(n);
          originaldivtree.free;
        end;
    end


(the new division/modulus by -1 is then converted elsewhere)
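
Read at source level, the tree built by that block seems to amount to something like the sketch below. This is only an illustration under stated assumptions: GuardedDiv and Temp are hypothetical names, LongInt stands in for signedtype, and the sketch ignores the operand narrowing that doremoveinttypeconvs performs on the else branch.

```pascal
program GuardedDivSketch;
{$mode objfpc}

{ Hypothetical source-level reading of the if/else block built for divn.
  Not compiler code; names and types are assumptions for illustration. }
function GuardedDiv(N, D: LongInt): LongInt;
var
  Temp: LongInt;  { plays the role of the tt_persistent temp node }
begin
  if D = -1 then
    { built from originaldivtree, i.e. the copy taken before the
      type conversions were removed; divisor is the constant -1 }
    Temp := N div (-1)
  else
    { the original node n, after doremoveinttypeconvs has run on it }
    Temp := N div D;
  Result := Temp;
end;

begin
  WriteLn(GuardedDiv(10, 3));
  WriteLn(GuardedDiv(10, -1));
  WriteLn(GuardedDiv(-7, 2));
end.
```

The then-branch divides by a constant, which is presumably what later gets turned into the unaryminusn seen in the node dump.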

Kit

On 11/05/2023 18:01, J. Gareth Moreton via fpc-devel wrote:

P.S. I found the code that adds the conditional checks; it's "doremoveinttypeconvs" in the ncnv unit. However, it's very unclear as to WHY it's doing it, as there are no comments around the code block.
Kit

On 11/05/2023 15:39, J. Gareth Moreton via fpc-devel wrote:
It does seem odd. In a practical sense, the only time I can see -1 being a common input among other random numbers is if it's an error value, in which case you would most likely do special handling rather than pass it through a division operation. With the slowdown that comes from additional branch prediction, it just seems like unnecessary fluff, but I need to double-check to see if there's a very good reason behind their generation (if it's a platform-specific problem, it should be moved to that platform's specific first pass).

Now I just need to find out where those nodes are generated - they're proving elusive!

Note that constant divisors use a different optimisation, so this only applies to variable divisors.
Kit

On 11/05/2023 12:07, Stefan Glienke via fpc-devel wrote:
Looks like a rather disadvantageous way to avoid the idiv instruction because x div -1 = -x and x mod -1 = 0.
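
The two identities can be spot-checked with a small program like the one below. It is a minimal sketch: IdentitiesHold is a hypothetical helper, and Low(LongInt) is deliberately left out of the sample values because its negation is not representable in LongInt (that is also the one input for which a 32-bit x86 idiv by -1 raises a divide error; whether that is the reason for the compiler's check is not established here).

```pascal
program DivModMinusOne;
{$mode objfpc}

{ Hypothetical helper: True when x div -1 = -x and x mod -1 = 0 hold for X. }
function IdentitiesHold(X: LongInt): Boolean;
var
  D: LongInt;
begin
  D := -1;  { variable divisor, as in the case discussed in this thread }
  Result := ((X div D) = -X) and ((X mod D) = 0);
end;

const
  { arbitrary sample values; Low(LongInt) intentionally excluded }
  Samples: array[0..4] of LongInt = (0, 1, -1, 12345, -98765);
var
  I: LongInt;
begin
  for I := Low(Samples) to High(Samples) do
    WriteLn(Samples[I], ': ', IdentitiesHold(Samples[I]));
end.
```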
I ran a quick benchmark doing a lot of integer divisions where sometimes (randomly) the divisor was -1. When the occurrence was rare enough (~5%), the performance was not impacted; the higher the occurrence of -1, the slower it became, down to almost half as fast. Only when fewer than 5% of the divisors were *not* -1 was the performance better, up to twice as fast when all divisors were -1. Of course YMMV, as it depends on the CPU and the branch predictor behavior, but it shows that this "optimization" is hardly any good.
I cannot think of a realistic case where 95% of your divisors are -1 and you really need to save those few extra cycles of calling idiv.
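
A benchmark of this kind could look roughly like the sketch below. It is not the program used for the numbers above; Count, Passes, the percentages, FillDivisors and SumDiv are all assumptions made for illustration, and the timings will vary with CPU, compiler version and optimisation settings.

```pascal
program DivBench;
{$mode objfpc}
uses
  SysUtils;

const
  Count = 1000000;  { divisions per pass; arbitrary }
  Passes = 50;      { repeated so the timing is measurable; arbitrary }
  Percents: array[0..4] of LongInt = (0, 5, 50, 95, 100);

type
  TIntArray = array of LongInt;

{ Fills Divisors so that roughly Percent % of the entries are -1. }
procedure FillDivisors(var Divisors: TIntArray; Percent: LongInt);
var
  I: LongInt;
begin
  SetLength(Divisors, Count);
  for I := 0 to Count - 1 do
    if Random(100) < Percent then
      Divisors[I] := -1
    else
      Divisors[I] := Random(1000) + 1;  { never zero }
end;

{ Sums (index + 100) div divisor; the sum keeps the divisions live. }
function SumDiv(const Divisors: array of LongInt): Int64;
var
  I: LongInt;
begin
  Result := 0;
  for I := 0 to High(Divisors) do
    Result := Result + ((I + 100) div Divisors[I]);
end;

var
  Divisors: TIntArray;
  P, Pass: LongInt;
  Start: QWord;
  Sum: Int64;
begin
  RandSeed := 1;  { fixed seed so runs are comparable }
  for P := Low(Percents) to High(Percents) do
  begin
    FillDivisors(Divisors, Percents[P]);
    Sum := 0;
    Start := GetTickCount64;
    for Pass := 1 to Passes do
      Sum := Sum + SumDiv(Divisors);
    WriteLn(Percents[P], '% of divisors are -1: ',
      GetTickCount64 - Start, ' ms, checksum ', Sum);
  end;
end.
```

Randomly placed -1 divisors are what make the comparison branch hard to predict; a sorted or periodic pattern would likely measure something different.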
On 11/05/2023 11:04 CEST J. Gareth Moreton via fpc-devel <fpc-devel@lists.freepascal.org> wrote:
  Hi everyone,

I need to ask a question about how division nodes are set up (I'm
looking at possible optimisation techniques).  I've written the
following procedure:

procedure DoDivMod(N, D: Integer; out Q, R: Integer);
begin
    Q := N div D;
    R := N mod D;
end;

Fairly simple and to the point.  However, even before the first node
pass, the following node tree is generated for an integer division
operation:

<statementn pos="24,10">
     <ifn resultdef="$void" pos="24,10" flags="nf_internal">
        <condition>
           <equaln resultdef="Boolean" pos="24,10" flags="nf_internal">
              <loadn resultdef="LongInt" pos="24,14">
                 <symbol>D</symbol>
              </loadn>
              <ordconstn resultdef="LongInt" pos="24,10" rangecheck="FALSE">
                 <value>-1</value>
              </ordconstn>
           </equaln>
        </condition>
        <then>
           <assignn resultdef="$void" pos="24,10" flags="nf_internal">
              <temprefn resultdef="LongInt" pos="24,10" flags="nf_write" id="$7C585E10">
                 <typedef>LongInt</typedef>
                 <tempflags>ti_may_be_in_reg</tempflags>
                 <temptype>tt_persistent</temptype>
              </temprefn>
              <unaryminusn resultdef="LongInt" pos="24,10">
                 <loadn resultdef="LongInt" pos="24,8">
                    <symbol>N</symbol>
                 </loadn>
              </unaryminusn>
           </assignn>
        </then>
        <else>
           <assignn resultdef="$void" pos="24,10" flags="nf_internal">
              <temprefn resultdef="LongInt" pos="24,10" flags="nf_write" id="$7C585E10">
                 <typedef>LongInt</typedef>
                 <tempflags>ti_may_be_in_reg</tempflags>
                 <temptype>tt_persistent</temptype>
              </temprefn>
              <divn resultdef="LongInt" pos="24,10">
                 <loadn resultdef="LongInt" pos="24,8">
                    <symbol>N</symbol>
                 </loadn>
                 <loadn resultdef="LongInt" pos="24,14">
                    <symbol>D</symbol>
                 </loadn>
              </divn>
           </assignn>
        </else>
     </ifn>
</statementn>

Something similar is made for "mod" as well. I have to ask though... is
it really necessary to check to see if the divisor is -1 and have a
distinct assignment for it? It's a bit of a rare edge case that usually
just slows things down since it tends to add a comparison and a
conditional jump to the final assembly language.  Is there some
anomalous behaviour to a processor's division routine if the divisor
is -1?

At the very least, would it be possible to remove the conditional check
when compiling under -Os?

(I intend to see if it's possible to merge "N div D" and "N mod D" on
x86, and possibly other processors that have a combined DIV/MOD
operator).
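
At source level, the effect of such a merge can be approximated by doing one division and deriving the remainder from the quotient, as in the sketch below. DoDivModMerged and MergedMod are hypothetical names, and this only illustrates the arithmetic identity N = (N div D) * D + (N mod D) for truncating division; it says nothing about how the compiler would actually combine the two nodes.

```pascal
program DivModMerge;
{$mode objfpc}

{ Hypothetical variant of DoDivMod that performs a single division. }
procedure DoDivModMerged(N, D: LongInt; out Q, R: LongInt);
begin
  Q := N div D;
  R := N - Q * D;  { same value as N mod D when div truncates toward zero }
end;

{ Hypothetical wrapper returning only the remainder, for easy comparison. }
function MergedMod(N, D: LongInt): LongInt;
var
  Q, R: LongInt;
begin
  DoDivModMerged(N, D, Q, R);
  Result := R;
end;

var
  Q, R: LongInt;
begin
  DoDivModMerged(-7, 2, Q, R);
  WriteLn('Q = ', Q, ', R = ', R, ', N mod D = ', -7 mod 2);
end.
```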
Kit

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel