Thanks! I ran some tests of my own, and profiling shows that the slowest part is the text scroller. I have already tried both a run of LDIs and a PUSH/POP scheme, but unfortunately the overhead of changing SP makes PUSH/POP slower than a sequence of LDIs. You know, TASM doesn't support clever macros, so I can't simply use a REPEAT statement to compute LD SP,nn for each PUSH/POP cycle. With a dynamic scheme, i.e. LD HL,16; ADD HL,SP; LD SP,HL, it's way too slow.
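
A rough sketch of the arithmetic behind that trade-off, using the Sam figures quoted from Edwin in this thread (PUSH 16T, POP 12T, LDI 20T per byte). The function name and the cost of reloading SP are assumptions made for illustration: 12T per LD SP,nn is a guess for the Sam (it is 10T on a plain Z80), and the dynamic LD HL / ADD HL,SP / LD SP,HL scheme would cost more.

```python
# Rough T-state estimate for a POP/PUSH block copy versus a run of LDIs.
# Instruction timings are the Sam figures quoted from Edwin in this thread.
POP_T = 12   # POP qq
PUSH_T = 16  # PUSH qq
LDI_T = 20   # LDI, per byte


def pop_push_t_per_byte(pairs, sp_reload_t):
    """T-states per byte when `pairs` register pairs are popped, SP is
    reloaded, the same pairs are pushed, and SP is reloaded again."""
    block_t = pairs * (POP_T + PUSH_T) + 2 * sp_reload_t
    return block_t / (2 * pairs)


if __name__ == "__main__":
    # ASSUMPTION: 12T per LD SP,nn on the Sam; not a figure from the thread.
    for pairs in (1, 2, 4, 8):
        cost = pop_push_t_per_byte(pairs, sp_reload_t=12)
        print(f"{pairs} pairs: {cost:.1f} T/byte (LDI: {LDI_T} T/byte)")
```

Under these assumptions the SP reloads are amortized over the number of register pairs moved per block, so PUSH/POP only beats LDI once more than two pairs are moved between reloads.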
PUSH 16T and POP 12T? I believe what you say, but the Z80 timings document I got from somebody here last week says PUSH 11T and POP 12T. So what's the reason for these differences? I assume it has something to do with memory contention on the Sam, and probably all those Z80 instruction timing documents are pretty useless on the Sam.

Btw, the assemblers I got from Sam users are not very good. If I understand correctly, PYZ doesn't use ASCII files. So I use TASM, but it uses nonstandard syntax (I hate it!), so I had to rewrite the original Comet source file (Comet uses quite good syntax - I think it's the Zilog Z80 standard). Also, I really miss REPEAT and preprocessor variables, like Borland's TASM or Microsoft's MASM have for i8086 CPUs. With clever macros, it would be easy to write optimized memory copying code. (That stupid Z80 cross-assembler TASM can't even include a binary file.)

I would like to write something like:

    COUNTER = 16
    REPEAT lines*128
        LD SP,COUNTER
        POP
        ...
        PUSH
        COUNTER = COUNTER + 16
    ENDM

Isn't it great in its simplicity? But it's not supported by those Z80 assemblers. :-((( :-((( :-(((

A.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Edwin Blink
Sent: Sunday, January 30, 2005 2:44 PM
To: sam-users@nvg.ntnu.no
Subject: Re: Fastest memory transfer (on Sam)

I need to move (video) memory on Sam, what's the fastest routine? I expect it must utilize push & pop instructions, since they take just 10T to move a byte, but I am unsure what is exact look of the whole routine. Please can somebody help me?

There is not a single fastest routine for everything. To get the fastest routine it must be tailor-made for its task.

For clearing or filling use PUSHes: 16T per two bytes
For scrolling use PUSH/POPs: 28T per two bytes
For copying use many LDIs: 20T per byte
For sprite drawing use LD (HL),nn, INC L: 16T per byte

Note that a PUSH qq takes 16T and a POP qq takes 12T.

Edwin
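
The REPEAT block wished for in the message above could be approximated by generating the unrolled source with a small script and feeding the result to the assembler. The following is only a sketch under stated assumptions: the function name, the choice of register pairs, the separate source and destination addresses, and the decimal operand format are all hypothetical and not taken from the thread.

```python
# Generates unrolled Z80 source text, standing in for a missing REPEAT macro.
def unrolled_copy(src, dst, blocks, pairs=("HL", "DE", "BC")):
    """Return Z80 source lines that copy `blocks` consecutive blocks
    from address `src` to address `dst` using POP and PUSH."""
    size = 2 * len(pairs)  # bytes moved per block
    lines = []
    for i in range(blocks):
        offset = i * size
        # POP reads upwards from SP, so point SP at the start of the block.
        lines.append(f"    LD   SP,{src + offset}")
        lines += [f"    POP  {p}" for p in pairs]
        # PUSH pre-decrements SP, so start just past the end of the block
        # and push the pairs in reverse order.
        lines.append(f"    LD   SP,{dst + offset + size}")
        lines += [f"    PUSH {p}" for p in reversed(pairs)]
    return lines


if __name__ == "__main__":
    # Hypothetical addresses, chosen only for illustration.
    print("\n".join(unrolled_copy(src=32768, dst=32896, blocks=2)))
```

A real routine would also need to save and restore the original SP and keep interrupts from firing while SP points at data, since an interrupt would push a return address into the block being copied.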