If the code breaks down to "x div 10" (with 10 as an
explicit constant and x being a 64-bit integer), then yes it will. If
the divisor is a variable though, then it won't. However, the "magic
div" algorithm only works for numerators up to a given bit size; if the
number
Hi,
in my big decimal unit, I need to divide by 10 all the time,
e.g.
https://github.com/benibela/bigdecimalmath/blob/master/bigdecimalmath.pas#L1324-L1325
would the magic div help there much?
Bye,
Benito
On 09.11.21 22:12, J. Gareth Moreton via fpc-devel wrote:
This one for
This one for Marģers specifically,
You'll be pleased to know that your insight has been partially implemented!
https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/51
This only expands 32-bit divisions to 64-bit, since the smaller sizes
requires more work at the node level, but is
not detect word and byte size on division
by constant? Squeezing in 32 bit register it would make byte-code
shorter, not necessarily faster.
- Reply to message -
*Subject: *Re: [fpc-devel] The "magic div" algorithm
*From: * J. Gareth Moreton via fpc-devel
<mailto:fpc-dev
Hi,
Thank you for implementation.
Is that true, that you can not detect word and byte size on division by constant? Squeezing in 32 bit register it would make byte-code shorter, not necessarily faster.
- Reply to message -
Subject: Re: [fpc-devel] The "magic div" algorith
This one is for Marģers especially:
https://gitlab.com/freepascal.org/fpc/source/-/issues/39355
Gareth aka. Kit
--
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
___
fpc-devel maillist -
Fixed the problem. I was deallocating a register too soon, so it got
overwritten in some cases while still in use. Running tests again.
Gareth aka. Kit
--
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
Div by 3 has a very elegant implementation. In 32-bit, it's
movl $0xAAAB,%eax
mull Input ; answer is in %edx:%eax
shrl $1,%edx
movl %edx, Result
As Marģers discovered, you can reduce the cycle count and minimise
pipeline stalls by extending to 64-bit and applying the shift to
Very interesting, Gareth !
Is the div-by-7 related to 2 to the 3rd ? If yes, is it possible to design a
div-by-3 with similar magics ?
Cheers, Ched'
Le 03.09.21 à 15:35, J. Gareth Moreton via fpc-devel a écrit :
Hey Marģers,
So I've been experimenting with your suggestion, and it looks
Hey Marģers,
So I've been experimenting with your suggestion, and it looks like a
resounding success! I added some new tests to the "bdiv" bench test to
see how it performs. 16-bit multiplications don't get improved as well
as they could be on x86_64 because the intermediate values are all
Well, you'll be pleased to know that I am now back from my camping and
hiking trip in the Lake District and have actually purchased "Hacker's
Delight", and I have to say it's extremely insightful. I would love to
play around and implement some of the suggested optimisations.
Some of them will
I came up with even shorter variant of div
example
function teDWordDivBy7_v4( divided : dword):dword; assembler; nostackframe;
asm
mov ecx,divided
mov rax,2635249153693862181
mul rcx
mov eax,edx
end;
current version for comparison
function teDWordDivBy7_v0( divided :
teByteDivBy7( divided : dword):dword; assembler; nostackframe;
asm
mov ecx,divided
mov eax,293
mul ecx
shr eax, 11
end;
- Reply to message -
Subject: Re: [fpc-devel] The "magic div" algorithm
From: Marģers . via fpc-devel
To: FPC developers' list
For unsigned
.
For unsigned byte, word and dword divisions by constant on 64 bit cpu can be converted as good cases.
Here is possibility for improvements.
- Reply to message -
Subject: Re: [fpc-devel] The "magic div" algorithm
From: J. Gareth Moreton via fpc-devel
To:
Something tells m
Something tells me I should purchase that book - I sense it could reveal
some interesting insights.
Note that while I understand the concept of turning integer division
into multiplication (indeed, I implemented the first version into x86
before it was improved with the
is there a reference to the algorithm that's used to calculate the
reciprocal constants used in the integer division optimisations for x86 and AArch64?
Hacker’s Delight
Second Edition
Henry S. Warren, Jr.
___
fpc-devel maillist -
Hi everyone,
I know I was told what the algorithm was already, but I can't find it in my
email history, so I'm
forced to ask again... is there a reference to the algorithm that's used to
calculate the
reciprocal constants used in the integer division optimisations for x86 and
AArch64? It's
17 matches
Mail list logo