Take-away:

> In the existing msp430-as, there is NO BEHAVIORAL DIFFERENCE between these
> two statements:

>  mov &extern, r15
>  mov extern, r15

> And that is flat out wrong.  If you have assembly code that depends on
> these being equivalent, it will break with the next release of mspgcc.  An
> ampersand will be required to produce absolute offsets.

At first glance, the two are indeed equivalent.
If this code gets linked the normal way, there is no difference between the two
statements.
However, the second one produces relocatable code if (and only if) the target
"extern" is part of the relocated area.
If you move the code in the second line, the target address moves with it,
while the first line will still refer to the original address.

If binutils hasn't been handling it this way, nobody will have noticed as long
as they weren't trying to move the code from its linked position.

> While reviewing the binutils port, I've found a frighteningly large number
> of bugs in assembly and disassembly, especially in the 430X instructions
> and in addressing modes not normally produced by gcc.  

Indeed, I found some myself. I remember an occurrence where an MSP430X
MOV instruction didn't receive a target address at all (moving to 0x00000).
And some other oddities.
But I'm afraid I just fixed them and did not keep the details (it was before
I joined the community).

> Here's the situation: MSP430 supports two addressing modes that involve
> constant offsets: "symbolic" is PC-relative, and "absolute" is, well,
> absolute.  Symbolic is implemented by adding an offset to the value of r0
> (=pc); absolute by adding an offset to r2 (=cg1 configured to read as zero).

Yep. However, the two, if assembled and linked correctly, behave identically
as long as the code is executed at the address it was linked to.

The main usage for symbolic mode is to allow relocatable code with local 
constants, subroutines or targets.

Since compiler-generated code is usually not used as relocatable code,
it makes no difference which one the compiler produces.


> In assembly code the instruction:
> mov &0x1000, r15
> loads the word at address 0x1000 into r15.  This is absolute mode.

Yep.

> The similar instruction:
>  mov 0x1000, r15
> is in symbolic mode.  If the opcode for mov is at address 0x2000, then the
>word at address 0x3002 (i.e., 0x2002+0x1000) will be loaded into r15.  (The
> extra 2 is because pc was incremented after reading the opcode.)

No, that's wrong.

This instruction actually loads the value at address 0x1000 too.
However, the binary instruction to be generated in this case is
MOV (0x1000-$-2)(PC), r15

(for the -2, see below about changes in the X core)

Needing to know the relative distance between the current PC and the target in
assembly source would be simply insane. It would require knowing
the length of all instructions and data fields, including those that
use the constant generator, between the symbolic instruction and the target,
making this mode practically unusable.
It's the job of the assembler to know the distance between the instruction and
the target.
And of course this mode makes no sense (but wouldn't technically hurt in most
cases) if used for a target outside the current code unit.
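
As a rough illustration of the arithmetic the assembler does here, a small Python sketch that computes the 16-bit offset word for a symbolic-mode source operand. The function name `symbolic_offset` and its parameters are made up for this example, and it assumes the offset word directly follows the opcode word (i.e. it sits at $+2), as for the source operand of a two-word mov.

```python
def symbolic_offset(opcode_addr, target_addr):
    """Offset word for 'mov target, rN' in symbolic mode (16-bit MSP430).

    Assumption: the offset word is stored right after the opcode word,
    i.e. at opcode_addr + 2, and PC points there when the offset is used.
    """
    return (target_addr - opcode_addr - 2) & 0xFFFF


# 'mov 0x1000, r15' with its opcode at offset 4, as in the quoted listing.
print(hex(symbolic_offset(0x0004, 0x1000)))
```

For the opcode at offset 4 this yields 0x0ffa, which matches the bytes "fa 0f" in the quoted object dump.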

> So: LTS-20110716, which has essentially the same binutils that's been in
> mspgcc for years, converts this:
> 
> .global extern
>        mov     &0x1000, r15
>        mov     0x1000, r15
>        mov     &extern, r15
>        mov     extern, r15
> into this:
> 00000000 <test>:
>   0:   1f 42 00 10     mov     &0x1000,r15
>   4:   1f 40 fa 0f     mov     0x0ffa, r15     ;PC rel. 0x01002
>   8:   1f 42 00 00     mov     &0x0000,r15
>                       a: R_MSP430_16  extern
>  c:   1f 40 00 00     mov     0x0000, r15     ;PC rel. 0x00010
>                       e: R_MSP430_16_PCREL_BYTE       extern

That's partly right. (see below)

> Absolute addressing mode is fine at this point, but symbolic mode has a
> couple flaws.  First, the specified value 0x1000 was improperly adjusted
> based on an assumption that the value would be stored at offset 6 (which it
> is at this point).  The result is that the address that would actually be
> read is 0x1000, rather than 0x1000+r0 which is what the instruction should
> have meant.  

No, that's exactly what the instruction meant. Take address 0x1000, but
calculate a relative offset to the current PC and store it in a symbolic 
mode instruction.
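
To check this against the quoted listing, here is a minimal decoder sketch in Python. It only handles the two cases discussed in this thread (As=01 with r0 or r2 as source register) and the name `decode_src` is invented for the example, so don't take it as a general disassembler.

```python
def decode_src(addr, raw):
    """Decode the source operand of a two-word format-I instruction.

    addr: address of the opcode word; raw: 4 bytes, little-endian.
    Only handles As=01 with src=r0 (symbolic) or src=r2 (absolute).
    """
    opcode = raw[0] | (raw[1] << 8)
    ext = raw[2] | (raw[3] << 8)
    src_reg = (opcode >> 8) & 0xF
    as_mode = (opcode >> 4) & 0x3
    if as_mode != 1:
        raise ValueError("not an indexed-type source operand")
    if src_reg == 0:   # r0 = PC: symbolic mode, offset counts from the ext word
        return "symbolic", (addr + 2 + ext) & 0xFFFF
    if src_reg == 2:   # r2 = CG1, reads as zero here: absolute mode
        return "absolute", ext
    raise ValueError("plain indexed mode; address depends on a register")


print(decode_src(0x0000, bytes.fromhex("1f420010")))
print(decode_src(0x0004, bytes.fromhex("1f40fa0f")))
```

Both instructions from the unlinked listing come out with 0x1000 as the address that gets read, one tagged absolute and the other symbolic.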

>This subverts the intent of symbolic addressing mode by making
> it effectively the same as absolute addressing mode.  

No, if the whole compilation unit is >4098 bytes, and is moved 
as a whole and without relocation to 0x2000, it will run unchanged. That's
what the symbolic mode is for.

> It's also wrong,
> because at this point the code is still relocatable and final address
> hasn't been determined: it probably won't be 6. 

Right, but if the whole block is relocated, then it still points to the correct
target address. Symbolic mode only makes sense if both code and target
have a fixed _relation_ (not position). With symbolic mode, the code can be
moved together with the target, while absolute mode is used if the target is
static and won't be moved with the code.
If the code isn't moved after linking, there is no difference at all.

Code that has been written with symbolic mode for all constants can be freely
put anywhere inside the address range and will still work.
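
A small Python sketch of that point, under the assumption of a hypothetical block whose instruction has its offset word at block offset 6 and whose data sits at block offset 0x1000. The block is "assembled" once at base 0 and then copied unchanged to other bases; all names are invented for the illustration.

```python
def operand_address(mode, ext_word_addr, ext_word):
    """Address read by an As=01 source operand on a 16-bit MSP430.

    mode: "absolute" (&addr, via r2) or "symbolic" (addr, via PC).
    ext_word_addr: where the extension word sits at run time.
    """
    if mode == "absolute":
        return ext_word
    if mode == "symbolic":
        return (ext_word_addr + ext_word) & 0xFFFF
    raise ValueError(mode)


# Hypothetical block layout, encoded once for base 0 and never patched.
EXT_OFFSET, DATA_OFFSET = 0x0006, 0x1000
symbolic_word = (DATA_OFFSET - EXT_OFFSET) & 0xFFFF
absolute_word = DATA_OFFSET

for base in (0x0000, 0x2000, 0x6000):
    sym = operand_address("symbolic", base + EXT_OFFSET, symbolic_word)
    ab = operand_address("absolute", base + EXT_OFFSET, absolute_word)
    print(f"base {base:#06x}: symbolic reads {sym:#06x}, absolute reads {ab:#06x}")
```

The symbolic operand follows the block (base + 0x1000) while the absolute one keeps reading 0x1000, and at base 0 the two agree.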

The flaw here is that the reference to extern isn't resolved at assemble time.
It could be. Just like a 'JMP extern' could (which is a relative instruction
too, in contrast to 'BR extern', which is actually emulated by a
'MOV #extern, PC' instruction).

However, it doesn't make any difference. If the linker locates extern @0x2000,
then the resulting instruction MOV (extern-2-PC)(PC), r15 still gets exactly
the same binary representation.

> (Note the "PC rel" comment
> suggests the address that would be read is 0x1002; it's not, because of bug

Yes and no. Take a look at the errata sheets:

(from the 5438 errata sheet)

CPU15        CPU Module
Function     Modifying the Program Counter (PC) behaves differently than in
             previous devices
Description  When using instructions with immediate or indirect addressing
             mode to modify the PC, a different value compared to previous
             devices must be added to get to the same destination.
Example      Previous device (MSP430F4619)
               label_1 ADD.W #Branch1-label_1-4h,PC
             MSP430F5438
               label_1 ADD.W #Branch1-label_1-2h,PC
             NOTE: The MOV instruction is not affected
Workaround   • Additional NOP after the PC-modifying instruction
             or
             • Change the offset value in software

You can see that for the add/sub instructions on 'previous' devices, the
offset is taken relative to the PC after the whole instruction has been
fetched (label_1+4), while on the newer MSP430X cores (F5438) it is taken
relative to the PC after only the opcode word has been fetched (label_1+2).

It needs to be evaluated how this affects the interpretation (disassembly)
of the binary instructions. Maybe something got mixed up when generating the
comments - or in the generation of the binaries.
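
The two forms in the errata example differ only in the constant that is subtracted. A short Python sketch of that calculation, with hypothetical addresses and an invented helper name `add_pc_immediate`; the bias values 4 and 2 are taken straight from the errata text above.

```python
def add_pc_immediate(label_addr, branch_addr, pc_bias):
    """Immediate for 'label: ADD.W #imm,PC' so execution lands on branch_addr.

    pc_bias is how far PC is ahead of the label when the addition happens:
    4 in the errata's 'previous device' example, 2 in its MSP430F5438 example.
    """
    return (branch_addr - label_addr - pc_bias) & 0xFFFF


# Hypothetical addresses, chosen only for illustration.
label_1, branch1 = 0x8000, 0x8010
print(hex(add_pc_immediate(label_1, branch1, 4)))  # previous device form
print(hex(add_pc_immediate(label_1, branch1, 2)))  # F5438 form
```

So the same source line needs an immediate that is 2 larger on the F5438 to reach the same destination.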


> Does your head hurt yet?  Mine does.)

It stops once one has a clear understanding of the purpose of the two
addressing modes and when they apply and when not.
I must admit that I was wondering about the two too when I first read about
them.
My first explanation was 'well, they are possible, so why not use them?'
In fact, there are a few other possible addressing modes using the constant
generator.
Using R3 instead of R2 as the register would lead to an absolute mode whose
addresses are offset by 1. But I don't see any possible use for this other than
a fancy 'high-byte addressing mode' that accesses the high byte of the word
that is located at the given address. Not really useful (and identical to
absolute mode for word instructions) ...

> Now, if you take that relocatable code and pass it through the linker with
> extern defined as 0x1000 and the text section starting at 0x6000, you get:

>   6000:       1f 42 00 10     mov     &0x1000,r15
>   6004:       1f 40 fa 0f     mov     0x0ffa, r15     ;PC rel. 0x07002
>   6008:       1f 42 00 10     mov     &0x1000,r15
>   600c:       1f 40 f2 af     mov     0xaff2, r15     ;PC rel. 0x01002

> Again, absolute mode is correct, and symbolic mode has made another attempt
> to convert the offset so that an absolute address is read.  Ignore the
> decoding error in the comment, because in practice the last two
> instructions would both read a word from offset 0x1000.

And here is indeed an error.
If the former address 0x1000 is part of the same relocated code unit, the first
symbolic instruction is correct (since the target has moved with the code).
If not, using symbolic mode didn't make any sense anyway.
However, the second symbolic instruction is relocated wrongly.
It should read 1f 40 f2 ff mov 0xfff2, r15

I wonder about the second absolute line; it should read 1f 42 00 60 mov
&0x6000, r15.
I think that is a copy/paste error of yours?
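
For comparing the two candidate encodings of the last instruction, a quick check of where each one would read from. This is only a sketch: it assumes the offset word sits at 0x600e (right behind the opcode at 0x600c) and 16-bit wraparound, and `symbolic_target` is a made-up helper name.

```python
EXT_WORD_ADDR = 0x600E  # opcode at 0x600c, offset word right behind it


def symbolic_target(ext_word, ext_word_addr=EXT_WORD_ADDR):
    """Address read by a symbolic-mode operand with the given offset word."""
    return (ext_word_addr + ext_word) & 0xFFFF


print(hex(symbolic_target(0xAFF2)))  # encoding from the quoted linker output
print(hex(symbolic_target(0xFFF2)))  # encoding suggested above
```

The first form reads from 0x1000 and the second from 0x6000, so which one is wanted depends on whether extern stays put or moves with the code.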

> These errors will not be fixed in LTS-20110716.

> However, unless somebody can convince me this analysis is wrong (which is
> part of why I'm posting this), in the next development release of mspgcc
> the original code will produce:

>   6000:       1f 42 00 10     mov     &0x1000,r15
>   6004:       1f 40 00 10     mov     0x1000, r15     ;PC rel. 0x07006
>   6008:       1f 42 00 10     mov     &0x1000,r15
>   600c:       1f 40 00 10     mov     0x1000, r15     ;PC rel. 0x0700e

Which would make the symbolic mode useless, as nobody could use it without
some additional hands to count instruction sizes.

Again, symbolic mode only makes sense if the code does not need any linking
(while it doesn't hurt if the linker applies the relocation, as long as it is
done correctly).

Code using symbolic mode can be moved around freely together with any target
addressed in symbolic mode.
A library whose internal function calls are written in symbolic mode can be
loaded anywhere at runtime. No need to link it at a fixed location.
If it also uses only local variables and no static/global ones, it does not
need to be linked into the project at all. It can be loaded from any storage to
a random memory location and executed there, and it will work.
It is, however, not an easy task, since symbolic mode cannot be used to
get a relative address as a value, as required for direct use as a call or push
target.

So the main usage for symbolic mode (since numerical constants are better
placed inside the instructions) is using it for relocation tables.
Load the code somewhere, and add the start address of the whole module to the
entries of a relocation table placed at the beginning.
Then the whole module can access its constants and functions through this
table without knowing where it was finally moved to.
And the external code could do the same (sort of library entry points).
No need to change every address inside the module.
It adds an additional cycle for loading the target addresses, but well,
no comfort without costs.
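
A minimal Python sketch of such a table fix-up, just to show the idea. The image layout (a table of 16-bit little-endian module-relative offsets at the very start of the module) and the name `relocate_table` are assumptions made for this example, not anything mspgcc or binutils provides.

```python
import struct


def relocate_table(image, entry_count, load_base):
    """Add load_base to the first entry_count 16-bit little-endian words.

    Assumption: the module starts with a table of module-relative offsets;
    everything behind the table is left untouched.
    """
    patched = bytearray(image)
    for i in range(entry_count):
        (offset,) = struct.unpack_from("<H", patched, 2 * i)
        struct.pack_into("<H", patched, 2 * i, (offset + load_base) & 0xFFFF)
    return bytes(patched)


# Hypothetical module: two table entries followed by four bytes of "code".
module = struct.pack("<HH", 0x0010, 0x0024) + b"\x1f\x40\xfa\x0f"
loaded = relocate_table(module, 2, 0x6000)
print([hex(w) for w in struct.unpack_from("<HH", loaded)])
```

Only the table is patched at load time; the code behind it reaches the table in symbolic mode and therefore needs no changes.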


> For the most part, the gcc port does not emit code that uses symbolic mode,
> so these bugs haven't been affecting it.

Indeed, symbolic mode is not suited to a normal compiler/linker combo.
But then, the MSP wasn't originally designed with a specific compiler or
linker in mind. :)

> There are a couple cases where it does, and those will have to be fixed too.  

That's surprising. Where?

>But it's assembly-language
>coders who will have to fix their code if they've been using symbolic mode
>where they should have been using absolute mode.

No, in the vast majority of cases, the two modes are interchangeable.
The only difference is that if you want to move the code after linking,
you'll need to use absolute mode for targets outside the moved code block
and symbolic mode for targets inside the moved code block.

If everything is linked to and used at a static execution address, 
the two are freely exchangeable.

JMGross

p.s.: now _I_ have a headache - and spent an hour more on this than I
originally would have: my office hours are long over.
And I won't put my hand in the fire for every detail that I wrote
above, though.
Just that a fix may be necessary, but the proposed one goes in the wrong
direction, as it would break the only use there is for symbolic mode at all.

