On 03/09/2010 02:31, William "Chops" Westfield wrote:
On Sep 2, 2010, at 12:35 PM, Andres Vahter wrote:
mov.w R12,R13 ; The operand "input" in register R12
rla.w R13
add.w R12,R13 ; X1=X*2^1+X
rla.w R13
rla.w R13
add.w R12,R13 ; X2=X1*2^2+X
rla.w R13
add.w R12,R13 ; X3=X2*2^1+X
rla.w R13
add.w R12,R13 ; X4=X3*2^1+X
rla.w R13
rla.w R13
rla.w R13
add.w R12,R13 ; Final Result=X5=X4*2^3+X
It computes ((((X*2 + X)*4 + X)*2 + X)*2 + X)*8 + X
= (((12X + X)*2 + X)*2 + X)*8 + X
= ((26X + X)*2 + X)*8 + X
= (54X + X)*8 + X
= 440X + X
= 441*X
Which is what it said at the top: it multiplies by 441.
It looks to me like a pretty standard multiplication algorithm, only
since one argument is a known constant, you get to leave out the steps
that would involve adding 0.
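To make that concrete, here is a rough C sketch of the shift-and-add scheme. The function names (mul_shift_add, mul441_unrolled) are made up for illustration and are not from the app note; uint16_t stands in for the 16-bit registers. The first function walks the bits of the constant from the top; the second is what you get for 441 (binary 110111001) once the loop is unrolled and the "add 0" steps are dropped, which matches the app note sequence step for step.

```c
#include <stdint.h>

/* Generic shift-and-add multiply, most significant bit of k first.
 * Hypothetical helper, shown only to illustrate the idea. */
uint16_t mul_shift_add(uint16_t x, uint16_t k)
{
    uint16_t acc = 0;
    for (int bit = 15; bit >= 0; bit--) {
        acc = (uint16_t)(acc << 1);        /* like rla.w R13 */
        if ((k >> bit) & 1u)
            acc = (uint16_t)(acc + x);     /* like add.w R12,R13 */
    }
    return acc;
}

/* The same thing unrolled for k = 441; zero bits cost only a shift. */
uint16_t mul441_unrolled(uint16_t x)
{
    uint16_t r = x;                        /* mov.w R12,R13       */
    r = (uint16_t)((r << 1) + x);          /* X1 = X*2^1  + X     */
    r = (uint16_t)((r << 2) + x);          /* X2 = X1*2^2 + X     */
    r = (uint16_t)((r << 1) + x);          /* X3 = X2*2^1 + X     */
    r = (uint16_t)((r << 1) + x);          /* X4 = X3*2^1 + X     */
    r = (uint16_t)((r << 3) + x);          /* X5 = X4*2^3 + X     */
    return r;
}
```

Results wrap modulo 2^16, the same as the register version would.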
I've seen code generators for other microcontrollers that claim to
generate the optimal sequence for multiplying a register by any
constant. It should be possible to do for MSP430 too. Perhaps the C
compiler already does so? (probably not; I've also seen complaints
that gcc does a poor job of multiplying by constants.)
BillW
I've only got a somewhat older msp430 gcc compiler. But a quick test on
other gcc targets with newer gcc versions shows that gcc can generate
better code than this app note, of the form:
(((x << 3) - x) << 6) - ((x << 3) - x)
The ((x << 3) - x) is only calculated once, and the results are stored.
Thus you have a shift by 3, a subtract, a copy, a shift by 6, and
another subtract. The msp code would be something like:
mov.w r12, r13
rla.w r13
rla.w r13
rla.w r13
sub.w r12, r13
mov.w r13, r12
rla.w r12
rla.w r12
rla.w r12
rla.w r12
rla.w r12
rla.w r12
sub.w r13, r12
That's marginally smaller and faster than the app note version (13
instructions instead of 14). For other constants, using subtraction and
re-using partial results can make a significant difference.
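Written back in C, the subtract-and-reuse form looks roughly like the sketch below. The function name mul441_sub is just for illustration, and uint16_t is assumed to match the 16-bit registers; the temporary t plays the role of r13 in the listing above.

```c
#include <stdint.h>

/* 441*x as (7x << 6) - 7x, with 7x computed once as (x << 3) - x.
 * Illustrative sketch only; arithmetic wraps modulo 2^16. */
uint16_t mul441_sub(uint16_t x)
{
    uint16_t t = (uint16_t)((x << 3) - x);   /* t = 8x - x = 7x        */
    return (uint16_t)((t << 6) - t);         /* 448x - 7x = 441x       */
}
```

The saving comes from 441 = 7 * 63: both factors are one less than a power of two, so each costs a shift and a subtract.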
This code (or at least, its equivalent for the avr and the ColdFire)
was generated using "x = x * 441" - i.e., the compiler generates it.
So if the newer versions of msp430 gcc do the same job as other current
gcc ports, just write the C code properly and let the compiler figure
out the best way to implement it.
Note that you might have to use particular flags, such as -O2 rather
than -Os, to get this optimisation.
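For anyone who wants to try this on their own toolchain, a test file can be as small as the sketch below. The function name mul441 is hypothetical. Compiling it with -O2 -S (-S makes gcc stop after emitting assembly) and then again with -Os -S should show whether your port picks a shift/add/subtract sequence or falls back to a multiply call; I have not checked what current msp430 gcc produces.

```c
#include <stdint.h>

/* Plain C multiply by a constant; the compiler chooses the sequence.
 * The 441u keeps the arithmetic unsigned so the result wraps cleanly
 * modulo 2^16 on both 16-bit and 32-bit int targets. */
uint16_t mul441(uint16_t x)
{
    return (uint16_t)(x * 441u);
}
```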