> The reworked version comes up with 110 bytes (still asserting MUL). Nicely done.
> perf-metering with avrtest reveals a run time from ~3100 up to < 4800
> ticks; high as expected.

Mine, by comparison, is 3161 cycles worst case (64 ones), or 4045 if !MUL.
So yours is actually not too unreasonable *if* the numbers are uniformly
distributed. With more common distributions which obey Benford's law, of
course, it's pretty awful speed-wise.

I really wish I could find a way to skip the totally unnecessary final
multiplication of 0 * 10 without adding an extra instruction.

One slight speed saving: "Ten" is never overwritten anywhere in the
function, so you can load it once in the preamble and leave it. Or you
could get rid of the Ten register entirely, which saves a spill/fill
(106 bytes!), and do "ldi r0,10" in the multiplication loop instead.

_______________________________________________
AVR-libc-dev mailing list
AVR-libc-dev@nongnu.org
https://lists.nongnu.org/mailman/listinfo/avr-libc-dev