"those instructions should all take one execution cycle each" is likely
where your problem lies. Who says so? The STM8 is pipelined, and the
"cycle" counts usually given assume that the first decode cycle of an
instruction overlaps with the execution cycle of the previous
instruction. This is only _mostly_ the case. Sometimes you get decode
stalls, for various reasons, and each stall costs an extra cycle. There
are two or three stalls in your code that stand out, but my guess is
that your difference comes from the read-after-write register stall
implied by add/inc a followed by dec (0x.., sp), coupled with a
difference in how often the add/inc is skipped over by the preceding jr.
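To get a feel for the second half of that guess, here is a rough Python sketch (not STM8 code, and not a pipeline model) that mirrors the two loops from the quoted message below and counts how often the inc/add actually runs rather than being skipped by the jrnc. The function names are made up for this sketch, and the input value 0x01 is an assumption for illustration only, since the real benchmark value is not shown in the thread.

```python
def rotate_left_8(value, count):
    """Mirror of Code 'A'. Returns (result, number of times 'inc a' ran)."""
    a = value & 0xFF
    n = count & 0x07                # and a, #0x07
    executed = 0
    while n:                        # tnz/jreq guard, then dec/jrne loop
        carry = (a >> 7) & 1        # sll a: bit 7 goes to carry
        a = (a << 1) & 0xFF
        if carry:                   # jrnc falls through
            a = (a + 1) & 0xFF      # inc a
            executed += 1
        n -= 1
    return a, executed


def rotate_right_8(value, count):
    """Mirror of Code 'B'. Returns (result, number of times 'add a, #0x80' ran)."""
    a = value & 0xFF
    n = count & 0x07
    executed = 0
    while n:
        carry = a & 1               # srl a: bit 0 goes to carry
        a >>= 1
        if carry:                   # jrnc falls through
            a = (a + 0x80) & 0xFF   # add a, #0x80
            executed += 1
        n -= 1
    return a, executed


# Hypothetical input; the shift count 6 matches the 'push #0x06' in the wrapper.
VALUE = 0x01
print(rotate_left_8(VALUE, 6))
print(rotate_right_8(VALUE, 6))
```

The left rotate tests the top six bits of the value and the right rotate tests the bottom six, so for most inputs the two routines take the carry branch a different number of times per call. Whether that alone explains two cycles per iteration depends on the stall behaviour, which this sketch does not model.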
Sadly, the pipetrace functionality was removed from uCsim (why?), so if
you want something more accurate than the
no-stalls-and-everything-overlaps counts, someone would have to work
through it by hand.
Mike
On 18/12/2022 03:16, Basil Hussain wrote:
I have a setup where I am using the timer facility in uCsim to
benchmark/profile the number of execution cycles of pieces of code. To
explain the setup briefly, I create a timer as well as a breakpoint on
writes to a GPIO port address, then with a breakpoint script, every
time it breaks I stop the timer, get its value, reset it to zero,
restart it, then continue sim execution. This allows me to bracket
sections of my code to be benchmarked just by toggling the relevant
GPIO port.
However, when recently looking at the results for two pieces of code
that should be identical in terms of number of execution cycles of the
assembly, I am actually seeing a discrepancy in the counted cycles.
Code 'A':
timer #0("benchmark") OFF 0.044375687499974 sec (710011 clks)
Code 'B':
timer #0("benchmark") OFF 0.045625625000010 sec (730010 clks)
I am at a loss to explain why there is a 20k cycle difference there.
One thing that does correlate is that 20k is a multiple of the number
of iterations of my benchmark testing, which is 10k. So something is
counting an extra 2 cycles per iteration within the code.
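As a quick check of that arithmetic, the snippet below recomputes the difference from the two timer reports. It only uses the figures quoted above; the variable names are arbitrary, and the clock estimate assumes the timer counts CPU clocks.

```python
# Figures copied from the two uCsim timer reports above.
clks_a = 710011
clks_b = 730010
secs_a = 0.044375687499974
iterations = 10000  # the 10k benchmark iterations mentioned above

diff = clks_b - clks_a             # total extra clocks measured for code 'B'
per_iteration = diff / iterations  # extra clocks per call
clock_hz = clks_a / secs_a         # simulated clock rate implied by the report

print(diff, per_iteration, round(clock_hz))
```

The exact difference is 19999 clocks rather than a round 20000, so the extra cost is very close to, but not exactly, 2 clocks on every iteration.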
Here are the pertinent pieces of code in question that I am trying to
benchmark:
Code 'A':
_rotate_left_8:
    ld a, (4 +1, sp)
    and a, #0x07
    ld (4 +1, sp), a
    ld a, (4 +0, sp)
    tnz (4 +1, sp)
    jreq 0003$
0001$:
    sll a
    jrnc 0002$
    inc a
0002$:
    dec (4 +1, sp)
    jrne 0001$
0003$:
    retf
Code 'B':
_rotate_right_8:
    ld a, (4 +1, sp)
    and a, #0x07
    ld (4 +1, sp), a
    ld a, (4 +0, sp)
    tnz (4 +1, sp)
    jreq 0003$
0001$:
    srl a
    jrnc 0002$
    add a, #0x80
0002$:
    dec (4 +1, sp)
    jrne 0001$
0003$:
    retf
As you can see, the only differences are one "sll" vs "srl"
instruction, and one "inc" vs "add" - the rest is identical. And those
instructions should all take one execution cycle each. So there should
be no difference in the total number of execution cycles between the
two pieces of code.
I did think perhaps there may be a difference in benchmarking wrapper
code that runs the code above for a specified number of iterations.
This code is in C, so is at the mercy of SDCC's compilation for
consistency of execution, so maybe differences exist. But, having
checked that, I see no significant differences.
Benchmark wrapper assembly for code 'A':
    bset 0x500a, #5
    ldw x, #0x2710
00122$:
    ldw y, x
    decw x
    tnzw y
    jreq 00125$
    pushw x
    push #0x06
    push _benchmark_rotate_val_8_65536_344+0
    callf _rotate_left_8
    addw sp, #2
    popw x
    jra 00122$
00125$:
    bres 0x500a, #5
Benchmark wrapper assembly for code 'B':
    bset 0x500a, #5
    ldw x, #0x2710
00212$:
    ldw y, x
    decw x
    tnzw y
    jreq 00215$
    pushw x
    push #0x06
    push _benchmark_rotate_val_8_65536_344+0
    callf _rotate_right_8
    addw sp, #2
    popw x
    jra 00212$
00215$:
    bres 0x500a, #5
You can see they are identical apart from the labels.
So, where is uCsim getting the differences in measured cycles from? I
seem to recall that although cycle counts for some STM8 instructions
were incorrect in older SDCC releases, they had been corrected quite a
while ago - are some still incorrect? This is with uCsim 0.6.4 from
SDCC 4.2.0.
Regards,
Basil Hussain
_______________________________________________
Sdcc-user mailing list
Sdcc-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sdcc-user