I'm using a variant of that page's square-table method for my 10-bit
multiplies (though I always compute (a + b)^2 - (a - b)^2, which is 4ab),
and something a lot like his divide. But my divide is a lot messier.
Specifically (in pyz80 syntax):
;
; FIXDIV, works out BC / DE and stores it to HL (all 8.8 fixed point)
;
; clobbers: a, bc, de
;
FIXDIV:
; check signs
ld a, b
and a
jp p, @+check2
; bc is negative
ld a, d
and a
jp p, @+divnegpos
; if we get to here then both are negative. Easy...
ld hl, 0
sbc hl, de
ex de, hl
ld hl, 0
and a
sbc hl, bc
ld b, h
ld c, l
jr @+divpospos
@check2:
; bc is positive
ld a, d
and a
jp m, @+divposneg ; jump if de is negative
@divpospos:
ld hl, 0
@DIVLOOP1: EQU FOR 8
sla b
rl l ; lb <<= 1
sbc hl, de ; find hl - de (nb: carry is definitely clear, so sbc is
; equivalent to sub)
jr nc, @+noadd ; no borrow: hl was at least de, too big to be the
; remainder, so keep the subtraction
add hl, de ; oh, hl was smaller - so add de back in again
@noadd:
NEXT @DIVLOOP1
; get to here and the quotient bits from this pass would have been the
; top byte of the 24 bit answer, which we just throw away (b is now 0)
dtpos:
@DIVLOOP2: EQU FOR 8
sl1 c
adc hl, hl ; hlc = (hlc << 1) | 1
sbc hl, de ; find hl - de (carry definitely clear)
jr nc, @+noadd ; no borrow: hl was at least de, too big to be the
; remainder, so keep the subtraction
add hl, de ; oh, hl was smaller than de - so add de back in again
dec c ; remove bottom bit of c
@noadd:
NEXT @DIVLOOP2
; c is now the middle byte of the answer ...
@DIVLOOP3: EQU FOR 8
sl1 b
add hl, hl ; double the remainder - NB carry from sl1 b ignored, so
; it's as though b = 0 at start of loop
and a ; need to clear carry - add hl, hl may overflow from here on out
sbc hl, de ; find hl - de (sub hl, de would be better here)
jr nc, @+noadd ; no borrow: hl was at least de, too big to be the
; remainder, so keep the subtraction
add hl, de ; oh, hl was smaller than de - so add de back in again
dec b ; remove bottom bit of b
@noadd:
NEXT @DIVLOOP3
; ... and b is low byte of the answer
ld h, c ; put b and c into hl, but the other way round
ld l, b
dlend: RET
@divnegpos:
; negate bc
ld hl, 0
and a
sbc hl, bc
ld b, h
ld c, l
call @-divpospos
ex de, hl
ld hl, 0
and a
sbc hl, de
ret
@divposneg:
; negate de
ld hl, 0
and a
sbc hl, de
ex de, hl
call @-divpospos
ex de, hl
ld hl, 0
and a
sbc hl, de
ret
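
For checking the routine on a host machine, here is a minimal Python sketch of what FIXDIV is meant to compute: a 24-step restoring division of (BC << 8) by DE on magnitudes, with the sign fixed up afterwards. The names to_signed16, udiv_8_8 and fixdiv are mine, not part of the listing, and raising on a zero divisor is an assumption (the assembly does not check for it). It models the arithmetic, not the register-level flag behaviour.

```python
def to_signed16(x):
    """Interpret a 16-bit register pair as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def udiv_8_8(bc, de):
    """Unsigned 8.8 divide by restoring division, like the three unrolled loops.

    The 24-bit dividend is bc followed by eight zero bits. The top eight
    quotient bits (first loop) do not fit in 8.8 and are discarded.
    """
    if de == 0:
        # Assumption: the assembly listing has no divide-by-zero check.
        raise ZeroDivisionError("FIXDIV divisor is zero")
    dividend = (bc & 0xFFFF) << 8
    remainder = 0
    quotient = 0
    for bit in range(23, -1, -1):
        # Shift the next dividend bit into the remainder (hl in the listing).
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        if remainder >= de:
            # Subtraction did not borrow, so it stands and the bit is 1.
            remainder -= de
            quotient |= 1
    return quotient & 0xFFFF


def fixdiv(bc, de):
    """Signed 8.8 divide: work on magnitudes, then negate if signs differ."""
    bc_s = to_signed16(bc)
    de_s = to_signed16(de)
    q = udiv_8_8(abs(bc_s), abs(de_s))
    if (bc_s < 0) != (de_s < 0):
        q = -q
    return q & 0xFFFF
```

For example, fixdiv(0x0300, 0x0200) models 3.0 / 2.0 and returns 0x0180 (1.5), and fixdiv(0xFD00, 0x0200) models -3.0 / 2.0 and returns 0xFE80 (-1.5).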
On 19 May 2008, at 08:00, ellvis wrote:
Hi,
I am not really a coder, but you can always try to look at http://baze.au.com/misc/z80bits.html
It's a collection of z80 math routines, all commented, it's done by
Baze/3SC - Spectrum demo coder.
Just for inspiration. Anyway, it's really great to read such
stuff here on the list, pretty inspiring!
ellvis
Chris Pile wrote:
Re: code, I really don't think that cycle pinching is the main
problem — I think it's algorithmic. I'm going to need to think
about that a bit more. At the minute I'm a bit indoctrinated as
to how I think a 3d pipeline should go, and I may need to rethink
that.
Sometimes you have to take the unconventional route when optimising
for "old-skool" hardware! Textbook methods are fine if you have the
horsepower - if not, then you're going to have to look at pulling out
all the tricks you can. :-)
--
ellvis/ZeroTeam
web: http://zeroteam.sk
e-mail: [EMAIL PROTECTED]
Sinclair ZX Spectrum support since 1996