The CPU clock is 6MHz, but when you compare the real speed of the computer with a ZX Spectrum, you can't see much difference. Since the ZXS has a standard PAL 3.54MHz clock, it should be much, much slower than the Sam. But it isn't. And I ask why? Here are some speculations from my head. ;-) I hope that Si Owen and other gurus will read this and shed more light on it.
To see all the consequences, I must start with the ZX Spectrum timing values. ZX Spectrum timing is well known: it draws 312 lines of 224T each frame, of which 256x192 is paper and the rest is border. A single line of 256 pixels is known to last exactly 128T, and the left and right border together take 96T, which makes 224T per line.

If you have ever programmed a real ZX Spectrum, you know that the CPU is slowed down when operating in the paper area. More exactly, when the ULA is generating the paper picture, the CPU can access videoram only once per 8T. Videoram is the lower 16KB of RAM, which is connected directly to the ULA and is used to store video data (only 6912 bytes). The ULA needs 2 bytes to display 8 pixels = 4T (obviously), so it reads roughly 1 byte every second T. Actually, this doesn't make sense to me, since I can't see any reason for "CPU can read 1 byte per 8T".

The fastest way of drawing on the screen is to use PUSH instructions, which are 1 byte, 11T. When running this from upper (fast) RAM, it results in two byte writes to videoram every 11T. Since the CPU cannot complete them any sooner than in 16T, it is halted for an additional 5T. Another possibility for writing to the screen is LD (nn),DE, which takes 20T, so it is not slowed. Not many instructions seem to be really slowed down, unless you put your whole program into videoram (i.e. in the 16KB-32KB address space). Then the CPU has really bad days...

Another black hole is the overall timing. I can't see where we lost almost 50000T. If you multiply 312*224*50, you get 3494400T per second, which is some 45600T less than the PAL clock. This either means that the real frame rate is about 1.3% faster (giving roughly 50.65Hz), or the remaining T states are spent in vertical retrace? I really haven't heard about this yet. Somebody may think it is obvious, but I really don't know.

With all this in mind, let's move on to the Sam. The Sam has 512KB of videoram, i.e. it can be slowed down by the ASIC chip almost anytime. The ASIC is generating a PAL picture, so I bet it has a 3.54MHz clock as well.
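
The Spectrum arithmetic above is easy to check with a few lines of Python. This is only a sketch of the figures quoted in this post: the 3.54MHz clock is the post's own assumption, the "one videoram access per 8T" rule is the simplified contention model described above, and all function names are made up for illustration.

```python
# Spectrum frame timing, using only the figures quoted in the post.
LINES_PER_FRAME = 312
T_PER_LINE = 224               # 128T paper + 96T border
NOMINAL_FPS = 50
ASSUMED_CLOCK_HZ = 3_540_000   # the post's "PAL 3.54MHz" figure (an assumption)


def t_per_frame():
    """T-states in one complete frame."""
    return LINES_PER_FRAME * T_PER_LINE


def t_per_second(fps=NOMINAL_FPS):
    """T-states used per second if frames repeat at exactly `fps`."""
    return t_per_frame() * fps


def missing_t_per_second(clock_hz=ASSUMED_CLOCK_HZ):
    """T-states per second that the frame arithmetic does not account for."""
    return clock_hz - t_per_second()


def implied_frame_rate(clock_hz=ASSUMED_CLOCK_HZ):
    """Frame rate if every clock tick belonged to some frame."""
    return clock_hz / t_per_frame()


def push_stall(instr_t=11, slot_t=8, writes=2):
    """Extra T-states under the simplified model where each videoram
    write needs its own `slot_t`-wide slot."""
    return max(0, writes * slot_t - instr_t)


if __name__ == "__main__":
    print("T per frame:       ", t_per_frame())
    print("T per second @50Hz:", t_per_second())
    print("unaccounted T/s:   ", missing_t_per_second())
    print("implied frame rate: %.2f Hz" % implied_frame_rate())
    print("PUSH stall:        ", push_stall(), "T")
    print("LD (nn),DE stall:  ", push_stall(instr_t=20), "T")
```

Changing ASSUMED_CLOCK_HZ shows how strongly the "missing" T-states depend on which clock value you believe.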
More precisely, the ASIC must have something like a 14.28MHz clock to be able to draw a mode 3 screen, which is 512x192. Since the resulting picture must comply with PAL standards (otherwise it wouldn't be possible to display it on a common TV), the clock must be pretty exact. I mean, whether it is 14.28 or not, it can be a multiple of the 6MHz CPU clock. This shouldn't be a problem unless we want to know exactly how the CPU is slowed down by the ASIC.

When you set the SOFF bit to turn the screen off, no paper is generated, so the CPU runs at 6MHz. As far as I know, some people observed this and realised that it is still less than 6MHz, but this is quite harmless for us, since we are looking for a speed loss of some 1.5MHz, not some kiloT per second... ;-)

When running the "Fidzi Speed Test" program in Win32 SimCoupe, it shows a very bad result: 4.883MHz in RAM, 5.561MHz in ROM. This can hardly be understood as "good timing". When I run this program on a real Sam, I get roughly 6MHz when the screen is off (which isn't emulated at all; I see weird screen artifacts instead), and 4.8MHz (mode 4) or 4.1MHz (mode 1) otherwise. It seems that 4.8MHz is a magic value, since the emulator always runs at this speed.

I spent plenty of years making ZXS emulators for the Sam, so I obviously find it very interesting why there's such a large speed decrease in mode 1. Although mode 1 is much easier to generate, the CPU is slowed down quite heavily (6.0 -> 4.1 = 68%!). This makes operations like loading ZX tapes possible. Obviously, there's something deeply hidden which makes it impossible to use SAVE, since during that time the CPU is back at its 4.8MHz. I know this because when I multiplied the timing values in the ZX-ROM by 4.8/3.5, I got very good results, and I can now save anything absolutely perfectly. (The exact values can be found in my emulators.)

This is beginning to be hard to understand, and we haven't even started yet... Since the ASIC draws its 256x192 picture in a similar way to the ULA, it probably slows the CPU down in the paper area similarly to the ULA.
Actually this word "probably" should be replaced by "surely not", since this would make tape operations impossible. When you load the ZX-ROM into the Sam's RAM and run it, you can see that the speed is different, but it is still CONSTANT. So there's something even less smart that makes the Sam's CPU run at 4.8MHz instead of 6MHz.

Somebody wrote a year ago that the CPU is probably aligned to 8T states per instruction or something like that. This doesn't make sense, because the number of bytes each instruction needs from memory may differ (e.g. LD A,B needs 1 byte (opcode) and would be 8T, while LD A,(HL) needs 2 bytes (opcode+data) and would be 16T rather than 8T). The Z80 docs say that both of these instructions should take less than 8T. But that's not all! The problem is "when is the CPU halted". If the CPU requests its byte from memory at the beginning of that 8T period, it will wait much longer than when it requests a byte at the end of that 8T period. Actually, I don't know exactly when the "beginning" or "end" of that 8T cycle is. Or even if there is any. Also, since the CPU clock is 6MHz, it's likely to be halted in a different way than on the ZX Spectrum. But this is pure speculation, since the Sam could possibly use faster RAM, which would make it possible to meet the 8T principle as known from the ZX Spectrum.

Since the ASIC must draw the picture, it has privileged access to videoram (the Sam has no RAM other than videoram!), and it must decide when the CPU will be granted a memory access. The ASIC doesn't share the CPU's clock, since it must run at the PAL clock. This applies to memory accesses by the ASIC, i.e. regardless of the real clock of the system bus, the ASIC needs to read videoram at PAL speed. At worst (mode 4) it must read 1 byte to draw 2 pixels. This results in an average of one memory access every 282ns (nanoseconds). Sam memory is known to be 100ns, which equals a 10MHz system bus (theoretically). Since I am not a hardware expert, I know nothing about system buses, or whether it is possible for two chips with different external clocks to share the same bus.
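
For anyone who wants to redo the memory numbers, here is a small Python sketch. It only restates figures from this post: the ASIC byte rate of 3.54 million bytes per second is an assumption (it follows from the guessed 3.54MHz clock and 1 byte per 2 pixels in mode 4), the MHz values are the speed test readings quoted earlier, and the function names are hypothetical.

```python
# Memory access arithmetic from the post; nothing here is measured.
ASSUMED_ASIC_BYTE_RATE = 3_540_000   # bytes per second in mode 4 (assumption)
RAM_ACCESS_NS = 100                  # "Sam memory is known to be 100ns"


def ns_per_access(accesses_per_second):
    """Average time between accesses, in nanoseconds."""
    return 1e9 / accesses_per_second


def bus_mhz(access_time_ns):
    """Theoretical bus rate if one access completes every `access_time_ns`."""
    return 1e3 / access_time_ns


def speed_loss_percent(nominal_mhz, measured_mhz):
    """How much slower the measured speed is than the nominal one."""
    return 100.0 * (1 - measured_mhz / nominal_mhz)


if __name__ == "__main__":
    print("ASIC access every %.0f ns" % ns_per_access(ASSUMED_ASIC_BYTE_RATE))
    print("theoretical bus:  %.0f MHz" % bus_mhz(RAM_ACCESS_NS))
    print("loss in mode 4:   %.1f %%" % speed_loss_percent(6.0, 4.8))
    print("loss in mode 1:   %.1f %%" % speed_loss_percent(6.0, 4.1))
```

None of this says how the ASIC really arbitrates the bus; it only shows that the quoted numbers are consistent with each other.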
What makes more sense is that the ASIC definitely drives memory on its own, and the CPU is its slave. This means that we would need to know more about the ASIC's design to say how it's done. Although I don't understand how it works, I'm sure that we are facing a very terrible computer design with an undersized system bus, resulting in a large speed loss (up to 32%). This is similar to modern PCs with a Pentium 3 running at 1000MHz and a bus running at 100MHz.

I think the best solution on the Sam would be circuitry with a separate 64KB of videoram and 512KB of other RAM, where the CPU would run at its full 6MHz. Then, when emulating the ZXS on the Sam, we would put all this stuff into videoram, so the CPU would be regularly slowed down, making ZXS emulation better (although still not perfect).

I'm awaiting your comments!

Aley Keprt

