Georg-Johann Lay <a...@gjlay.de> wrote:
> The algo is rather slow because it always iterates over all
> digits, i.e. it won't run faster for small numbers.
>
> Have fun!
>
> Code size is ~140 bytes.

Well, it's bigger (140 > 124), slower, and doesn't handle sizes *other*
than 64 bits, so that's not terribly useful.

I think you could shrink it a bit, replacing these 16 instructions of
messy digit output code (why are you looping incrementing DIGIT2 when
you know it is never more than 1?):

        clr     DIGIT2
1:      inc     DIGIT2
        subi    DIGIT, 10
        brcc    1b

        brts    2f
        ;; T = 0 is the first round. Output the high digit if it's not '0'.
        set
        subi    DIGIT2, 1-'0'
        ;; Initialize nonZero. We only output digits if we saw a digit != '0'.
        mov     nonZero, DIGIT2
        cpi     nonZero, '0'
        breq    2f
        st      X+, DIGIT2
2:
        ;; Output digits except the highest (except that for 10^19).
        subi    DIGIT, -10-'0'
        or      nonZero, DIGIT
        ;; We only output digits if we saw a digit != '0', i.e. strip leading '0's.
        cpi     nonZero, '0'
        breq    3f
        st      X+, DIGIT

With these 9 instructions:

        cpi     DIGIT, 10       ;; First "digit" can be as high as 18
        brcs    2f
        ldi     nonZero, '1'    ;; '1' is non-zero, which is perfect
        st      X+, nonZero
        subi    DIGIT, 10
2:      or      nonZero, DIGIT
        breq    3f              ;; Don't print leading zeros
        subi    DIGIT, -'0'     ;; DIGIT is 0..9 here, so only the ASCII offset is added
        st      X+, DIGIT
3:

With this, you can also delete the leading clt. It eliminates DIGIT2,
but unfortunately that doesn't save a spill. You also have to adapt the
final "lone zero" printing code to print if nonZero == 0, but that's the
same size.

Also, this is just silly:

        dec     Count
        cpse    Count, Zero
        rjmp    .Loop

"dec" sets the zero flag, so that can just be "dec Count $ brne .Loop".

And finally, your multiply loop is wasting two instructions:

        mul A0,Ten  $  mov A0,r0  $  add A0,Cy  $  mov Cy,r1  $  adc Cy,Zero
        mov __tmp_reg__,A0
        mov A0,A1  $  mov A1,A2  $  mov A2,A3  $  mov A3,A4
        mov A4,A5  $  mov A5,A6  $  mov A6,A7  $  mov A7,__tmp_reg__

"mov A0,r0" and "mov __tmp_reg__,A0" cancel each other out (__tmp_reg__
is r0) and should both be deleted, with the "A0 += Cy" adjusted to add
to r0, of course.

Just make it:

        mul A0,Ten  $  add r0,Cy  $  adc r1,Zero  $  mov Cy,r1
        mov A0,A1  $  mov A1,A2  $  mov A2,A3  $  mov A3,A4
        mov A4,A5  $  mov A5,A6  $  mov A6,A7  $  mov A7,r0

That saves 22 bytes, leaving it 6 bytes smaller than mine. Nice to have
available!

_______________________________________________
AVR-libc-dev mailing list
AVR-libc-dev@nongnu.org
https://lists.nongnu.org/mailman/listinfo/avr-libc-dev
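
For anyone who wants to sanity-check the two rewrites off-target, here
is a minimal C model of the digit output and of the multiply step. It
is only a sketch. The function names (emit_old, emit_new,
emitters_agree, mul10_step, mul10_u64) are made up for illustration and
appear in neither routine. It assumes that nonZero starts at 0 in the
9-instruction version, that the first "digit" is 0..18 and later digits
are 0..9, and that the carry is 0 at the start of each pass over the
eight bytes. Since DIGIT is already 0..9 when it is stored in the short
version, the model adds plain '0' there.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the 16-instruction sequence. first_round models T == 0,
   *non_zero models the nonZero register. Returns the advanced X pointer. */
char *emit_old(char *x, uint8_t digit, int first_round, uint8_t *non_zero)
{
    uint8_t digit2 = 0;
    int borrow;
    do {                                  /* 1: inc DIGIT2 / subi DIGIT,10 / brcc 1b */
        digit2++;
        borrow = digit < 10;
        digit = (uint8_t)(digit - 10);
    } while (!borrow);
    if (first_round) {                    /* brts 2f skips this in later rounds */
        digit2 = (uint8_t)(digit2 - 1 + '0');   /* subi DIGIT2, 1-'0' */
        *non_zero = digit2;
        if (*non_zero != '0')
            *x++ = (char)digit2;
    }
    digit = (uint8_t)(digit + 10 + '0');  /* subi DIGIT, -10-'0' */
    *non_zero |= digit;
    if (*non_zero != '0')
        *x++ = (char)digit;
    return x;
}

/* Model of the 9-instruction sequence. Assumes *non_zero starts at 0. */
char *emit_new(char *x, uint8_t digit, uint8_t *non_zero)
{
    if (digit >= 10) {                    /* cpi DIGIT,10 / brcs 2f */
        *non_zero = '1';
        *x++ = '1';
        digit = (uint8_t)(digit - 10);
    }
    *non_zero |= digit;                   /* or nonZero,DIGIT / breq 3f */
    if (*non_zero != 0)
        *x++ = (char)(digit + '0');       /* digit is 0..9 here */
    return x;
}

/* Returns 1 if both models print the same text for every three-round
   sequence (first "digit" 0..18, later digits 0..9), else 0. */
int emitters_agree(void)
{
    for (int d0 = 0; d0 <= 18; d0++)
        for (int d1 = 0; d1 <= 9; d1++)
            for (int d2 = 0; d2 <= 9; d2++) {
                char a[8] = {0}, b[8] = {0};
                char *pa = a, *pb = b;
                uint8_t nz_old = 0, nz_new = 0;
                int d[3] = {d0, d1, d2};
                for (int i = 0; i < 3; i++) {
                    pa = emit_old(pa, (uint8_t)d[i], i == 0, &nz_old);
                    pb = emit_new(pb, (uint8_t)d[i], &nz_new);
                }
                if (strcmp(a, b) != 0)
                    return 0;
            }
    return 1;
}

/* One iteration of the shortened multiply loop body: low byte times 10
   plus incoming carry, then rotate the eight bytes down by one. */
uint8_t mul10_step(uint8_t a[8], uint8_t cy)
{
    uint16_t p = (uint16_t)(a[0] * 10u + cy);   /* mul / add r0,Cy / adc r1,Zero */
    memmove(a, a + 1, 7);                       /* mov A0,A1 ... mov A6,A7 */
    a[7] = (uint8_t)p;                          /* mov A7,r0 */
    return (uint8_t)(p >> 8);                   /* mov Cy,r1 */
}

/* Eight iterations multiply the 64-bit value by 10 in place and return
   the overflow (0..9), assuming the carry starts at 0. */
uint8_t mul10_u64(uint64_t *v)
{
    uint8_t a[8], cy = 0;
    for (int i = 0; i < 8; i++)
        a[i] = (uint8_t)(*v >> (8 * i));
    for (int i = 0; i < 8; i++)
        cy = mul10_step(a, cy);
    *v = 0;
    for (int i = 0; i < 8; i++)
        *v |= (uint64_t)a[i] << (8 * i);
    return cy;
}
```

Calling emitters_agree() from a small main() compares the two digit
output variants over all 1900 three-round inputs, and mul10_u64() can
be compared against an ordinary 64-bit multiply on the host. Neither
says anything about code size or cycle counts on the AVR itself.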