The CPU clock is 6MHz, but when you compare the real speed of the computer
with a ZX Spectrum, you can't see much difference. Since the ZXS has the
standard PAL 3.54MHz clock, it should be much, much slower than the Sam.
But it isn't. And I ask why? Here are some speculations from my head. ;-)
I hope that Si Owen and other gurus will read this and shed more light on it.

To see all the consequences, I must start with the ZX Spectrum timing values.
ZX Spectrum timing is well known: it draws 312 lines of 224T each frame, of
which 256x192 pixels are paper and the rest is border. This is because a
single line of 256 pixels is known to last exactly 128T. The left and right
border together take 96T, which makes 224T per line. If you have ever
programmed a real ZX Spectrum, you know that the CPU is slowed down when
operating in the paper area. More exactly, while the ULA is generating the
paper picture, the CPU can access videoram only once per 8T. Videoram is the
lower 16KB of RAM, which is connected directly to the ULA and is used to
store video data (only 6912 bytes). The ULA needs 2 bytes to display
8 pixels = 4T (obviously), so it reads roughly 1 byte every second T.
Actually, this doesn't make sense to me, since I can't see any reason for
"CPU can read 1 byte per 8T".
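
To keep the numbers straight, here is the line and frame arithmetic from the
paragraph above as a small Python sketch. It uses only the figures quoted
there; the constant names are mine.

```python
# ZX Spectrum frame timing, using only the figures quoted in the text.
PAPER_T_PER_LINE = 128    # 256 paper pixels take 128T
BORDER_T_PER_LINE = 96    # left and right border together
LINES_PER_FRAME = 312

t_per_line = PAPER_T_PER_LINE + BORDER_T_PER_LINE
t_per_frame = LINES_PER_FRAME * t_per_line
pixels_per_t = 256 / PAPER_T_PER_LINE

print(t_per_line)     # 224
print(t_per_frame)    # 69888
print(pixels_per_t)   # 2.0
```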

The fastest way of drawing on the screen is to use PUSH instructions, which
are 1 byte, 11T. When running this from the higher (fast) RAM, it results in
two byte writes to videoram every 11T. Since the CPU cannot complete these
writes in less than 16T, it is halted for an additional 5T. Another way of
writing to the screen is LD (nn),DE, which takes 20T, so it is not slowed.
Not many instructions seem to be really slowed, unless you put your whole
program into videoram (i.e. in the 16KB-32KB address range). Then the CPU
has really bad days...
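
The PUSH and LD (nn),DE cases can be put into a very crude model. This is
only a sketch of the reasoning above, not a description of what the ULA
really does: it assumes one videoram access per 8T and ignores where inside
the instruction the writes happen. The function name effective_t is made up.

```python
SLOT_T = 8  # assumption from the text: one CPU videoram access per 8T

def effective_t(nominal_t, videoram_accesses):
    """Crude model: an instruction cannot finish sooner than its
    videoram accesses allow (one access per SLOT_T)."""
    return max(nominal_t, videoram_accesses * SLOT_T)

push_t = effective_t(11, 2)   # PUSH: 11T nominal, 2 byte writes
ld_t = effective_t(20, 2)     # LD (nn),DE: 20T nominal, 2 byte writes

print(push_t, push_t - 11)    # 16 5  (halted for 5 extra T)
print(ld_t, ld_t - 20)        # 20 0  (not slowed)
```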

Another black hole is the overall timing. I can't see where we lose almost
50000T. If you multiply 312*224*50, you get 3494400T per second, which is
some 50000T less than the PAL clock. Either this means that the real frame
rate is about 1% faster (giving some 50.5Hz), or the remaining T states are
spent in vertical retrace? I really haven't heard about this yet. Somebody
may think it is obvious, but I really don't know.
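
Here is the same calculation as a Python sketch, taking the 3.54MHz figure
from this post at face value (that clock value is the assumption here). It
only shows the size of the gap; it doesn't say which explanation is right.

```python
CLOCK_HZ = 3_540_000        # the 3.54MHz figure used in this post
T_PER_FRAME = 312 * 224     # lines per frame * T per line

t_per_second_at_50hz = T_PER_FRAME * 50
missing_t = CLOCK_HZ - t_per_second_at_50hz
implied_frame_rate = CLOCK_HZ / T_PER_FRAME

print(t_per_second_at_50hz)          # 3494400
print(missing_t)                     # 45600
print(round(implied_frame_rate, 2))  # 50.65
```

With exactly 3.54MHz the implied frame rate comes out slightly above the
rough 50.5Hz estimate, so the "about 1%" is only a ballpark.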

With all this in mind, let's go to the Sam.
The Sam has 512KB of videoram, i.e. it can be slowed by the ASIC chip almost
anytime. The ASIC generates a PAL picture, so I bet it has a 3.54MHz clock
as well. Better said, it must have something like 14.28MHz to be able to
draw the mode 3 screen, which is 512x192. Since the resulting picture must
comply with PAL standards (otherwise it wouldn't be possible to display it
on a common TV), the clock must be pretty exact. I mean, whether it is 14.28
or not, it can be a multiple of the 6MHz CPU clock. This shouldn't be a
problem unless we want to know exactly how the CPU is slowed by the ASIC.

When you set the SOFF bit to turn the screen off, no paper is generated, so
the CPU runs at 6MHz. As far as I know, some people observed this and
realised that it is still less than 6MHz, but this is quite harmless for us,
since we are searching for some 1.5MHz of speed loss, not some kiloT per
sec... ;-)
When running the "Fidzi Speed Test" program in Win32 SimCoupe, it shows a
very bad result: 4.883MHz in RAM, 5.561MHz in ROM. This can hardly be
regarded as "good timing". When I run this program on a real Sam, I get
cca. 6MHz when the screen is off (which isn't emulated at all; I see weird
screen artifacts instead), and 4.8MHz (mode 4) or 4.1MHz (mode 1) otherwise.
It seems that 4.8MHz is a magic value, since the emulator always runs at
this speed. I spent plenty of years making ZXS emulators for the Sam, so I
obviously find it very interesting why there's such a large speed decrease
in mode 1. Although mode 1 is much easier to generate, the CPU is slowed
quite heavily (6.0 -> 4.1, i.e. down to 68%!). This makes operations like
loading ZX tapes possible. Obviously, there's something deeply hidden which
makes it impossible to use SAVE, since at that time the CPU is back at its
4.8MHz. I know this, since when I multiplied the timing values in the ZX-ROM
by 4.8/3.5, I got very good results, and I can now save anything absolutely
perfectly. (The exact values can be found in my emulators.)
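
The 4.8/3.5 trick and the percentages above can be sketched in Python like
this. The helper scale_delay and the 3500T example delay are made up for
illustration; the real constants are the ones in my emulators, not these.

```python
def scale_delay(zx_t_states, sam_mhz=4.8, zx_mhz=3.5):
    """Scale a ZX-ROM delay (in T states) so that it lasts the same
    real time on a Sam running at sam_mhz."""
    return round(zx_t_states * sam_mhz / zx_mhz)

# Hypothetical delay of 3500T (1 ms at 3.5MHz), not a real ROM value.
print(scale_delay(3500))     # 4800

# Measured speeds on the real Sam as a fraction of 6MHz.
mode1_fraction = 4.1 / 6.0
mode4_fraction = 4.8 / 6.0
print(f"{mode1_fraction:.0%}")   # 68%
print(f"{mode4_fraction:.0%}")   # 80%
```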

This is becoming hard to understand, and we haven't even started yet...
Since the ASIC draws its 256x192 picture in a similar way to the ULA, it
probably slows the CPU down in the paper area similarly to the ULA.
Actually, this "probably" should be replaced by "surely not", since that
would make tape operations impossible. When you load the ZX-ROM into Sam's
RAM and run it, you can see that the speed is different, but it is still
CONSTANT. So there's something even less smart that makes Sam's CPU run at
4.8MHz instead of 6MHz. Somebody wrote a year ago that the CPU is probably
aligned to 8T states per instruction or something like that. This doesn't
make sense, because the number of bytes each instruction needs may differ
(e.g. LD A,B needs 1 byte (opcode) and would be 8T, while LD A,(HL) needs
2 bytes (opcode+data) and would be 16T rather than 8T). The Z80 docs say
that both of these instructions should take under 8T (4T and 7T).
But that's not all! The problem is "when is the CPU halted". If the CPU
requests its byte from memory at the beginning of that 8T period, it will
wait much longer than if it requests the byte at the end of that 8T period.
Actually, I don't know exactly when the "beginning" or "end" of that 8T
cycle is. Or even if there is one.
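
To show what I mean by "when is the CPU halted", here is a purely
hypothetical Python model. It assumes an 8T window in which the CPU is only
served in the last T; nobody has confirmed that the ASIC works this way, and
the function name wait_states is made up.

```python
WINDOW_T = 8  # hypothetical window length, borrowed from the ZX Spectrum

def wait_states(request_phase):
    """T states the CPU waits if it asks for memory at request_phase
    (0..7) and is only served in the last T of the window."""
    return WINDOW_T - 1 - (request_phase % WINDOW_T)

print(wait_states(0))   # 7  (request at the beginning: long wait)
print(wait_states(7))   # 0  (request at the end: no wait)

average_wait = sum(wait_states(p) for p in range(WINDOW_T)) / WINDOW_T
print(average_wait)     # 3.5
```

In this model the slowdown depends entirely on how the instruction stream
lines up with the window, which is why knowing where the window starts
matters.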

Also, since the CPU clock is 6MHz, it is more likely to be halted in a
different way than on the ZX Spectrum. But this is pure speculation, since
it could possibly use faster RAM, which would make it possible to meet the
8T principle known from the ZX Spectrum. Since the ASIC must draw the
picture, it has privileged access to videoram (the Sam has no RAM other than
videoram!), and it must decide when the CPU will be granted a memory access.
The ASIC doesn't share the CPU's clock, since it must run at the PAL clock.
This applies to memory accesses by the ASIC, i.e. regardless of the real
clock of the system bus, the ASIC needs to read videoram at PAL speed. At
worst (mode 4) it must read 1 byte to draw 2 pixels. This results in an
average interval between memory accesses of 282ns (nanoseconds). Sam memory
is known to be 100ns, which equals a 10MHz system bus (theoretically).
Since I am not a hardware expert, I know nothing about system buses, or
whether it is possible for two chips with different external clocks to share
the same bus. What makes more sense is that the ASIC drives the memory on
its own, and the CPU is its slave. This means that we would need to know
more about the ASIC's design to say how it's done.
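
The 282ns and 10MHz figures come out of the following Python sketch. It
assumes the ASIC draws 2 pixels per 3.54MHz T state, the same as the ULA;
that is my guess from the ZX Spectrum numbers, not a known fact about the
ASIC.

```python
PAL_T_HZ = 3_540_000          # assumed PAL-derived clock, as on the ZXS
PIXELS_PER_T = 2              # assumed: 256 pixels in 128T, like the ULA
PIXELS_PER_BYTE_MODE4 = 2     # mode 4: 1 byte draws 2 pixels

pixel_rate_hz = PAL_T_HZ * PIXELS_PER_T
fetch_interval_ns = 1e9 * PIXELS_PER_BYTE_MODE4 / pixel_rate_hz

RAM_CYCLE_NS = 100            # Sam memory speed quoted in the text
max_bus_mhz = 1000 / RAM_CYCLE_NS

print(round(fetch_interval_ns))   # 282
print(max_bus_mhz)                # 10.0
```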

Although I don't understand how it works, I'm sure that we are facing a very
terrible computer design with an undersized system bus, resulting in a large
speed loss (up to 32%). This is similar to modern PCs with a Pentium 3
running at 1000MHz and a bus running at 100MHz. I think the best solution on
the Sam would be circuitry with a separate 64KB of videoram and 512KB of
other RAM, where the CPU would run at its full 6MHz. Then, when emulating
the ZXS on the Sam, we would put all this stuff into videoram, so the CPU
would be regularly slowed to make the ZXS emulation better (although still
not perfect).

I'm awaiting your comments!

Aley Keprt
