Andrew Collier wrote:
> 16* uncontended t-states for an IN a,(n) or OUT (n),a;
> 20** uts for an IN a,(c) or OUT (c),a.
>
> Question: Does SimCoupe currently use those values for the
> instruction time?
Not quite; the values aren't that fixed, as the position in the scanline can vary the timings by 4 t-states. I currently still have the raw timings for the instructions in the code (e.g. 11 for IN A,(n), 12 for OUT (C),r and 16 for OUTI) but then tweak the values as necessary according to the display position and state. At some point I'll wrap the 'tstate' variable assignment macro so that the values are rounded to 4 t-states at compile time, to save doing it at run-time.

The current implementation does 8 t-state instruction rounding for all parts of the mode 1 display, or just for the centre 256x192 block of the screen in the other modes when the screen is enabled. The remaining situations use the normal 4 t-state rounding. I treat mode 2 the same as modes 3 and 4, except the screen can't be disabled - am I right in thinking the timings are the same?

For all the instructions that do port reads and writes, an additional 4 t-states are added when the current scan position is not already 8 t-state aligned, and only for ports >= 0xe0 (from Si Cooke).

Additional timing values I'm using that make a big difference for some software are: IM0/IM1 time = 8 t-states, IM2 = 16 t-states (rounded), interrupts active for 120 t-states and visible on the status port for 102 t-states (thanks for all those Ian!), and line interrupts triggered at the start of the right-hand border area (line cycle position = 384-64 = 320).

I've removed all of the original timing tweaks that were in the code for whatever reason, as they only seemed to break things - I was hoping to do without any but we'll see how it goes. All these have helped it survive various small timing tests done by various people, including the now infamous Defender loop ;-)

Now that I've implemented the display changes (border, palette and/or video mode) at instruction level, it's possible to see how it copes with some of the SAM demos that rely on perfect timing.
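As an aside, the rounding rules described above can be sketched roughly as follows. This is only an illustration and not the SimCoupe source: the function names are made up, rounding *up* to the next multiple is my reading of "rounded", and the port check assumes the comparison is made on the 8-bit port number.

```python
def round_up(tstates, unit):
    """Round a t-state count up to the next multiple of unit (4 or 8)."""
    return (tstates + unit - 1) // unit * unit


def rounding_unit(mode, in_main_screen, screen_enabled):
    """Choose the rounding granularity for the current display state."""
    if mode == 1:
        return 8                  # mode 1: 8 t-state rounding everywhere
    if mode == 2:
        screen_enabled = True     # mode 2 is treated like 3/4, but can't be disabled
    if in_main_screen and screen_enabled:
        return 8                  # centre 256x192 block with the screen on
    return 4                      # border, or screen disabled


def io_extra(port, line_cycle):
    """Extra 4 t-states for ports >= 0xE0 when not 8 t-state aligned."""
    return 4 if port >= 0xE0 and line_cycle % 8 != 0 else 0


def instruction_time(raw, mode, in_main_screen, screen_enabled,
                     port=None, line_cycle=0):
    """Raw timing, rounded for the display state, plus any I/O penalty."""
    total = round_up(raw, rounding_unit(mode, in_main_screen, screen_enabled))
    if port is not None:          # only instructions that touch a port
        total += io_extra(port, line_cycle)
    return total
```

With these assumptions, an IN A,(n) (raw 11) in the border of mode 4, reading a port at or above 0xE0 from an unaligned scan position, comes out as 12 + 4 t-states, whereas the same read from an aligned position stays at 12.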
In general it seems to cope quite well, but there still seem to be some subtle timing issues that mean the left-right positioning isn't quite right (not looked into yet). Effects in the border seem to run at the right speed, but ones on the main screen area run a little too fast. Here are some screenshots and problems (20-30K each):

Mnemo demo 1, part 2 (http://www.obobo.demon.co.uk/mnemo1p2.jpg). The bottom border display stays synchronised correctly but its left-right position isn't correct. The scroller on the main screen uses rapid VMPR switching, but the diagonal pattern shows it's running too fast and so isn't lined up correctly; some additional delay is needed.

Mnemo demo 2, part 2 (http://www.obobo.demon.co.uk/mnemo2p2.jpg). Scroller lined up ok, but the right-hand edge has a strange stretching effect on the edge of the character (an 'o' in this case) that's appearing.

Mnemo demo 2, part 3 (http://www.obobo.demon.co.uk/mnemo2p3.jpg). Bars in the bottom border jump left and right by 1 to 2 blocks' worth, and the misalignment gives additional lines that should probably be aligned with the bars (?). [It doesn't just run at 11fps as the title says; I'd just unpaused it and it hadn't settled.]

E-Tunes demo (http://www.obobo.demon.co.uk/e-tunes.jpg). VMPR-switched scroller now visible, tho the start position is slightly to the right and it's also a couple of blocks too short.

Lyra 3 (http://www.obobo.demon.co.uk/lyra3.jpg). Top scroller seems to run fine, and the bottom static image is shifted to the right. The other visual artifacts I used to see (where the screen should be disabled to hide stuff?) are no longer visible :-)

The current SimCoupe doesn't seem to show enough of the border areas (mainly top and bottom) to show all of the border effects, so it might be worth having windowed modes show more of them (possibly optionally).
'ESI' in Lyra 3 does look rather like 'FST' :-)

> Summary:
>          OUT (n),a   OUT (c),a   IN a,(n)   IN a,(c)
> VMPR     16 *        20 **       16 *       20 **

Those values fit the case when the scan position isn't 8 t-state aligned, since the extra 4 t-states are being added to both (of course, there's no 8 t-state rounding to consider in your tests). If you put a NOP before the other instructions I'd expect you to get the same timing result, as the NOP will add 4 t-states but there won't be an additional 4 t-states to add for the alignment - would you please be able to try it?

I've noticed that some places where the video timing isn't quite right seem to involve DJNZ for tight delay loops. The width of the scroller section used by the E-Tunes demo is mainly just one such loop. Is there anything special about DJNZ in terms of timing that could cause it to be too fast?

I'm also starting to wonder about instructions lying across the boundaries where memory contention is introduced, as the subtle timings might make a difference, and that'd be difficult to implement. Another sub-instruction thing I've wondered about is whether I need to worry about which part of the instruction actually does the OUT that'll affect the video, e.g. does it occur before the end of an OUTI? And if so, could it make a 1 block difference in some cases? (Or am I just getting paranoid about timing?! No, don't answer that...)

Comments and/or corrections on any of the above would be greatly appreciated!

Best regards,

Si

ICQ: 9769343, Homepage: http://www.obobo.demon.co.uk/

P.S. I've just realised that the OTd(R)/INd(R) instruction timings won't be right, since the extra 4 t-states are calculated before the instruction time is added to the LineCycleCounter value. I assume the instruction takes 12 t-states, which is correct for the other types of I/O instructions but not for the block ones (which are 16). I'll have to correct that and see if it fixes any of the above tests...
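To illustrate why the assumed instruction length in the P.S. matters, here is a small sketch. It is a guess at the shape of the check rather than the actual code: `alignment_penalty` is a made-up name, and it assumes the 8 t-state alignment is tested at the line position plus the instruction's length.

```python
def alignment_penalty(line_cycle, instr_tstates):
    """Extra 4 t-states if the position after instr_tstates isn't 8-aligned.

    line_cycle is the line position *before* the instruction time is added,
    so the instruction length has to be supplied separately.
    """
    return 4 if (line_cycle + instr_tstates) % 8 != 0 else 0


# Compare the assumed length (12) with the block I/O length (16)
# at a few line positions that are multiples of 4.
for pos in (0, 4, 8, 12):
    assumed = alignment_penalty(pos, 12)
    actual = alignment_penalty(pos, 16)
    print(pos, assumed, actual)
```

Because 12 and 16 differ by 4, under this assumption the two versions disagree at every 4 t-state position: whenever the assumed length adds the penalty the real length wouldn't, and vice versa.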

