[Mspgcc-users] msp430-elf size optimizations - some notes and a patch

DJ Delorie Wed, 05 Mar 2014 14:36:23 -0800

I'm writing this document to collect some of the current/new knowledge
on how to minimize the flash/rom size needed for MSP430 applications,
using the new msp430-elf (FSF/Red Hat) tools. I'll keep a copy at
http://people.redhat.com/~dj/msp430/size-optimizations.html


This document has two purposes: to collect this information in one
public place, and to ask others to test it out and provide feedback.

    Linker Section Optimizations
    __int20 Patch for Large Model
    Building Newlib for Reduced Size

Obvious: Please do all size testing with -Os, which optimizes for
size, and not -O3, which optimizes for speed.

** Linker Section Optimizations

The msp430-elf assembler has a new directive ".refsym" that adds a
reference to the named symbol into the generated object. In the past,
to do this, you'd use a ".word" directive instead, but that takes up
space in the final image. This new directive takes up no space in the
image.

The startup code provided in the upstream newlib/libgloss (crt0 et al)
uses ".refsym" to tell the linker about dependencies between the
various snippets of startup code and things in the whole image which
require them. This dependency information is partly manual in crt0.S
itself, and partly automatic in code generated by gcc and gas.

For example, gcc has code to check if main() ever returns. In most
embedded programs, it won't. If it *does* return, gcc adds a ".refsym"
that causes a snippet in crt0.S to be included which adds a call to
exit() after the call to main(). If main() doesn't return, there won't
be any special code after the call to main().

The assembler has code to detect if either the data or bss sections
are used, and if they are, it will use .refsym to tell crt0 to pull in
snippets of code to initialize the RAM correctly. However, this means
that most objects will now have extra "undefined" symbols that aren't
part of your application:

        U __crt0_movedata

This new functionality means that there's a new library that must be
linked in: "-lcrt". This is built by libgloss (part of newlib) by
splitting up crt0.S, and contains all the snippets. To link this
library correctly, you need a special new section in your linker
script, which looks like this:

.text           :
  {
    . = ALIGN(2);
    PROVIDE (_start = .);
    KEEP (*(SORT(.crt_*)))
    *(.lowtext .text .stub .text.* .gnu.linkonce.t.* .text:*)

The KEEP/SORT line places all the snippets from crt0 at that point
(after _start but before the rest of your program) in asciibetical
order. Conveniently, the sections in libcrt.a are all named like
".crt_0013something" so the four-digit number causes them to all be
inserted in the right order.
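
To see why the zero-padded four-digit prefix matters, here is a small
host-side sketch in C. It is not the linker's code; it only assumes that
sorting by name behaves like a byte-wise strcmp() comparison. The
function name sort_section_names() and any section names you feed it
(other than the ".crt_0013something" pattern above) are made up for
illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: plain byte-wise ("asciibetical") string comparison. */
static int compare_names(const void *a, const void *b)
{
    const char *const *left = a;
    const char *const *right = b;
    return strcmp(*left, *right);
}

/* Sort an array of section names in place, byte-wise by name.
   Assumption: this approximates what a sort-by-name in the linker
   script does; it is an illustration, not ld's implementation. */
void sort_section_names(const char **names, size_t count)
{
    assert(names != NULL || count == 0);
    if (count < 2)
        return;
    qsort(names, count, sizeof names[0], compare_names);
}
```

With names such as ".crt_0700tail", ".crt_0013something" and
".crt_0100middle" (the first and last are hypothetical), the sorted
result is 0013, 0100, 0700. Without the zero padding, "100" would sort
ahead of "13", which is why the fixed-width number is needed.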

What's the net result of all this? A simple "blink an led" program
that has no global variables can take as few as 24 bytes of flash
depending on how you blink the led!

** __int20 Patch for Large Model

The second big change is some ongoing work to add true "__intN"
support to the GCC internals. Before now, gcc had one __int128 type
built-in and any target that wanted something else had to hack it in
somehow, without support from gcc's core. I've put a huge unofficial
patch online at:

http://people.redhat.com/~dj/msp430/int20-patch.txt

This patch may not apply cleanly if the upstream sources have changed
too much since the patch was generated.

There are two parts to this patch: The first part is the core __intN
support, and the second part is changes to the msp430 backend to
enable __int20 and support it as a regular integer type. Note that
this patch mostly affects "large model" programs (-mlarge) as it
changes pointer math to use __int20 for size_t instead of "unsigned
long".

To explicitly use the __int20 type, replace "int" with "__int20" like
this:

unsigned __int20 x[10];
extern __int20 a, b, c;
void foo (__int20 a, void *b);

Note that __int20 won't work (and you'll get a helpful compile-time
error) unless you are building for an MSP430X-class cpu. You are
allowed to use an explicit __int20 type with small model, though.
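
As a rough illustration of why a 20-bit type is useful for large model
pointer math, the following host-side sketch emulates address
arithmetic at a given bit width using masks. It does not use __int20
itself (so it builds with any C compiler), and the helper names
umax_for_bits() and add_wrapped() are hypothetical. It assumes unsigned
arithmetic that wraps modulo 2^bits.

```c
#include <assert.h>
#include <stdint.h>

/* Largest unsigned value representable in `bits` bits (1 <= bits <= 31). */
uint32_t umax_for_bits(unsigned bits)
{
    assert(bits >= 1 && bits <= 31);
    return (UINT32_C(1) << bits) - 1u;
}

/* Add an offset to an address and wrap the result to `bits` bits,
   emulating unsigned arithmetic of that width on the host. */
uint32_t add_wrapped(uint32_t addr, uint32_t offset, unsigned bits)
{
    return (addr + offset) & umax_for_bits(bits);
}
```

For example, add_wrapped(0xFFFE, 4, 16) yields 0x0002 because the sum
wraps around in 16 bits, while add_wrapped(0xFFFE, 4, 20) yields
0x10002. A 20-bit value tops out at 0xFFFFF, so it can cover a 1 MiB
address range.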

** Building Newlib for Reduced Size

In some cases, applications may want to use the stock newlib runtime
but want to reduce the amount of flash newlib routines use. If you're
willing to rebuild newlib yourself, there are some config options you
can provide that remove features you may not need. For an up-to-date
list of these options, run "./configure --help" in the newlib/
subdirectory. Any --enable-foo option can be given as --disable-foo to
disable a feature. For example:

../newlib-trunk/configure --disable-newlib-io-float

There is also an alternate tiny malloc() implementation that can be
enabled:

../newlib-trunk/configure --enable-newlib-nano-malloc

Note that you can specify multiple --enable/--disable options on one
configure command line.

