Looks like a rather disadvantageous way to avoid the idiv instruction because x div -1 = -x and x mod -1 = 0.
I ran a quick benchmark doing a lot of integer divisions where sometimes (randomly) the divisor was -1. When the occurence was rare enough (~5%) the performance was not impacted, the higher the occurence of -1 was the slower it became to almost half as fast. Only when less than 5% of the divisors were *not* -1 the performance was better up to twice as fast when all divisors were -1. Of couse ymmv as it depends on the CPU and the branch predictor behavior but it shows that this "optimization" is hardly any good. I cannot think of a realistic case where 95% of your divisors are -1 and you really need to save those few extra cycles of calling idiv. > On 11/05/2023 11:04 CEST J. Gareth Moreton via fpc-devel > <fpc-devel@lists.freepascal.org> wrote: > > > Hi everyone, > > I need to ask a question about how division nodes are set up (I'm > looking at possible optimisation techniques). I've written the > following procedure: > > procedure DoDivMod(N, D: Integer; out Q, R: Integer); > begin > Q := N div D; > R := N mod D; > end; > > Fairly simple and to the point. However, even before the first node > pass, the following node tree is generated for an integer division > operation: > > <statementn pos="24,10"> > <ifn resultdef="$void" pos="24,10" flags="nf_internal"> > <condition> > <equaln resultdef="Boolean" pos="24,10" flags="nf_internal"> > <loadn resultdef="LongInt" pos="24,14"> > <symbol>D</symbol> > </loadn> > <ordconstn resultdef="LongInt" pos="24,10" rangecheck="FALSE"> > <value>-1</value> > </ordconstn> > </equaln> > </condition> > <then> > <assignn resultdef="$void" pos="24,10" flags="nf_internal"> > <temprefn resultdef="LongInt" pos="24,10" flags="nf_write" > id="$7C585E10"> > <typedef>LongInt</typedef> > <tempflags>ti_may_be_in_reg</tempflags> > <temptype>tt_persistent</temptype> > </temprefn> > <unaryminusn resultdef="LongInt" pos="24,10"> > <loadn resultdef="LongInt" pos="24,8"> > <symbol>N</symbol> > </loadn> > </unaryminusn> > </assignn> > </then> > <else> > <assignn resultdef="$void" pos="24,10" flags="nf_internal"> > <temprefn resultdef="LongInt" pos="24,10" flags="nf_write" > id="$7C585E10"> > <typedef>LongInt</typedef> > <tempflags>ti_may_be_in_reg</tempflags> > <temptype>tt_persistent</temptype> > </temprefn> > <divn resultdef="LongInt" pos="24,10"> > <loadn resultdef="LongInt" pos="24,8"> > <symbol>N</symbol> > </loadn> > <loadn resultdef="LongInt" pos="24,14"> > <symbol>D</symbol> > </loadn> > </divn> > </assignn> > </else> > </ifn> > </statementn> > > Something similar is made for "mod" as well. I have to ask though... is > it really necessary to check to see if the divisor is -1 and have a > distinct assignment for it? It's a bit of a rare edge case that usually > just slows things down since it tends to add a comparison and a > conditional jump to the final assembly language. Is there some > anomalous behaviour to a processor's division routine if the divisor is -1? > > At the very least, would it be possible to remove the conditional check > when compiling under -Os? > > (I intend to see if it's possible to merge "N div D" and "N mod D" on > x86, and possibly other processors that have a combined DIV/MOD operator). > > Kit > > _______________________________________________ > fpc-devel maillist - fpc-devel@lists.freepascal.org > https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel _______________________________________________ fpc-devel maillist - fpc-devel@lists.freepascal.org https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel