On Fri, Apr 27, 2012 at 8:42 AM, David Brown <da...@westcontrol.com> wrote:
> Hi,
>
> One possible idea is that since r12-r15 are caller-saves rather than callee
> saves, then it might be possible to restrict 20bit usage to those registers.
>  Then it is the caller's responsibility to save the register over function
> calls, and the callee doesn't have to have __sr20__.  Four 20-bit registers
> could be restrictive if you are doing a lot of 20-bit arithmetic, but it
> should be good enough for most data accesses or far function calls.  The
> result is perhaps more efficient if you want to use mostly 16-bit code, but
> occasionally need more data range.

Eh, maybe.  Again, let's see how big an issue it is in reality before
complicating stuff.
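
One rough way to size the issue is to count register-save bytes along a call chain under both schemes. The sketch below is a back-of-the-envelope model in portable C, not anything mspgcc computes. The `struct frame` fields and both function names are made up for illustration. The only hardware facts assumed are that a 16-bit register save takes 2 bytes of stack and a 20-bit save (pushx.a/pushm.a) takes 4. Return addresses, locals and arguments are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Stack bytes needed to save one register on MSP430X:
   a 16-bit push takes one word, a 20-bit save takes two. */
enum { SAVE16_BYTES = 2, SAVE20_BYTES = 4 };

/* Hypothetical description of one function on the active call chain. */
struct frame {
    unsigned callee_saved; /* call-saved registers pushed in the prologue */
    unsigned live_20bit;   /* 20-bit values live across its outgoing call */
};

/* Scheme A: every function is __sr20__, so each prologue save is 20-bit. */
size_t stack_callee_saves_20(const struct frame *chain, size_t n)
{
    size_t bytes = 0;
    assert(chain != NULL || n == 0);
    for (size_t i = 0; i < n; ++i)
        bytes += (size_t)chain[i].callee_saved * SAVE20_BYTES;
    return bytes;
}

/* Scheme B: prologues save 16 bits only, and a caller saves its own
   live 20-bit values around each call. */
size_t stack_caller_saves_20(const struct frame *chain, size_t n)
{
    size_t bytes = 0;
    assert(chain != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        bytes += (size_t)chain[i].callee_saved * SAVE16_BYTES;
        bytes += (size_t)chain[i].live_20bit * SAVE20_BYTES;
    }
    return bytes;
}
```

For a three-deep chain saving 3, 2 and 4 registers, with one 20-bit value live in the middle function, the model gives 36 bytes for scheme A and 22 for scheme B. With many live 20-bit values per function the comparison tips the other way, so the answer depends on the real code.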

> I note that this all talks about 20-bit registers, addresses, attributes,
> etc.  I have a slight fear that once you've got everything working, TI will
> extend the range to 24 bits (perhaps to keep up with the AVR XMEGA that
> "supports" 16MB address ranges).  Will you then have to start from scratch,
> or can you re-use much of the code?

It's pretty clean; the only bobble is that GCC has only one named
PSImode ("partial single integer" mode), and having two might be a bit
complicated.  But it could be done.
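
Setting GCC's mode machinery aside, what a change of width means for arithmetic can be modelled in portable C. This is only an illustrative sketch: `wrap_bits` and `sum_wrapped` are hypothetical helpers, and a `uint32_t` masked to the chosen width stands in for a partial-integer register. It says nothing about how the back-end represents such values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Keep only the low `width` bits, as a register of that width would. */
uint32_t wrap_bits(uint32_t value, unsigned width)
{
    assert(width >= 1 && width <= 32);
    return width == 32 ? value : (value & ((UINT32_C(1) << width) - 1));
}

/* Accumulate 16-bit inputs in an accumulator that is `width` bits wide. */
uint32_t sum_wrapped(const uint16_t *vals, size_t n, unsigned width)
{
    uint32_t sum = 0;
    assert(vals != NULL || n == 0);
    for (size_t i = 0; i < n; ++i)
        sum = wrap_bits(sum + vals[i], width);
    return sum;
}
```

Fed the fifteen values from the example quoted further down in this thread, a 16-bit accumulator wraps to 0x703B, while both a 20-bit and a 24-bit accumulator hold the full 0x3703B. Only the mask changes between widths in this model.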

I can't see why anybody would want to build a product on such a beast
though; a low-end ARM is a much cleaner architecture, already exists
with a lot of commercial buy-in, and as I understand it is getting
competitive in the power management world.  Not my area of expertise,
though.

> One idea for dealing with calling "__c16__" functions from far memory would
> be to handle them indirectly.  For every "__c16__" function "foo", you could
> generate the original function "foo" and a trampoline "foo_c20" which
> consists of "calla foo; reta" and is forced to reside in the near code area.
>  Any functions placed in the "__far" address space would call "foo_c20"
> instead of "foo".  Any functions declared with the "__c20__" attribute would
> make "foo_c20" an alias for "foo".  This would mean that code that is mostly
> in the lower memory would be smaller (unused "foo_c20" trampolines could be
> garbage-collected by the linker), faster, and use less stack space, while
> the occasional "__far" function would still work correctly.

So far, I haven't seen a motivation for trampolines.  calla/call and
reta/ret are the same size, so there's no cost for using 20-bit
address constants, and __c20__ itself doesn't impose any space
overhead.  Off-hand, if you don't have function pointers as data
objects thus requiring use of -msr20, building with -mc20 should have
no impact on code size.

The only case I've thought of where you might want a trampoline is if you
needed to access a binary library that had __c16__ code from far
memory.  I don't think that's something the compiler needs to support
directly.
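
That one case can be stated as a small predicate. The sketch below is a portable C model, not compiler code: the enum, the constants and both function names are invented for illustration. It assumes that a __c16__ function returns with ret, which restores only a 16-bit PC, so it cannot return directly to a caller located at or above 64 KiB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* How the callee returns: ret (16-bit) or reta (20-bit). */
enum call_conv { CONV_C16, CONV_C20 };

#define NEAR_LIMIT UINT32_C(0x10000)   /* first address beyond near memory */
#define ADDR_LIMIT UINT32_C(0x100000)  /* first address beyond 20 bits */

/* A trampoline is needed only when far code calls a 16-bit-return callee. */
bool needs_trampoline(uint32_t caller_addr, enum call_conv callee)
{
    assert(caller_addr < ADDR_LIMIT);
    return callee == CONV_C16 && caller_addr >= NEAR_LIMIT;
}

/* Stack bytes for the return address: call pushes 2, calla pushes 4. */
unsigned return_address_bytes(enum call_conv callee)
{
    return callee == CONV_C16 ? 2u : 4u;
}
```

In this model code built entirely with -mc20 never needs a trampoline, at the price of 2 extra stack bytes per active call.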

Did you have some other use case in mind?

Peter

>
>
> mvh.,
>
> David
>
>
>
>
>
> On 27/04/2012 10:23, Peter Bigot wrote:
>>
>> On Thu, Apr 26, 2012 at 11:06 PM, Wayne Uroda<wayne.ur...@grabba.com>
>> wrote:
>>>
>>> Hi Peter,
>>>
>>> This is amazing work. I am very excited for 20bit support in
>>> mspgcc.
>>>
>>> After reading the document I want to ask what may be an extremely
>>> out-there question:
>>>
>>> I gather that most people who require far code support should use
>>> the compiler options -mc20 and -msr20. I am not sure how wasteful
>>> the __sr20__ attribute on all functions will actually be, but it
>>> feels like a waste of RAM if say less than 5% of functions actually
>>> make use of 20bit registers (am I correct in saying, in general,
>>> only functions which perform pointer arithmetic on function
>>> pointers will generate 20bit register values? [ignoring far data
>>> pointers and explicit use of 20b integer types]).
>>
>>
>> I don't think you can do pointer arithmetic on function pointers.  I
>> do think it's fairly likely that data pointers will be kept in
>> registers.  For the most general concept of "large" memory model,
>> all data pointers will be 20-bit, whether what they point to is in
>> near or far memory.  E.g., when iterating over an array, the address
>> should be in a register.
>>
>> Part of my conservative advice to always use -msr20 if doing
>> anything with 20 bits is that I can't prevent gcc from choosing to
>> place the address of (say) printf into a register and calling it
>> through that, if some optimization decides that doing so is a good
>> idea.  In such a case, the value must be preserved.  The back-end has
>> little influence on the register allocator and middle-end
>> optimizations.
>>
>>>
>>> Could it perhaps be more efficient (overall) for a function which
>>> makes use of 20bit values in registers to save its own 20b
>>> registers before calling a function which could potentially
>>> overwrite them?
>>>
>>> Eg.
>>>
>>>   uint16_t vals[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0x8001, 0x8002,
>>>                       0x8002, 0x8004, 0x8005, 0xf000 };
>>>   /* ... */
>>>   {
>>>     uint20_t sv;
>>>     sv = 0;
>>>     for (i = 0; i < sizeof(vals)/sizeof(*vals); ++i) {
>>>       sv += vals[i];
>>>       // BEFORE CALLING PRINTF (OR ANY FUNCTION WHICH IS NOT KNOWN TO
>>>       // BE __sr20__) THE COMPILER/ASSEMBLER CAN LOCALLY SAVE ANY 20b
>>>       // REGISTERS HERE
>>>       printf("Add %04" PRIx16 " produces %05" PRIx20 "\n", vals[i], sv);
>>>       // ON RETURN FROM FUNCTION 20b REGISTERS ARE RESTORED
>>>     }
>>>   }
>>>
>>> I am not saying we should change the calling convention - all
>>> functions still save and restore 16b registers in their prologues
>>> and epilogues as required. I am just thinking that the compiler
>>> could automatically detect which registers are 20bit and need to be
>>> saved using pushx.a/pushm.a instructions and do the save/restore
>>> locally (before pushing and after popping any stack based function
>>> parameters). There is every chance that the registers themselves
>>> will not be modified, which means that the 4 bytes of storage are
>>> wasted for each register. If they *are* used by the called function
>>> (or one of its called functions) they will be saved twice (or more
>>> times), which uses slightly more storage than necessary. I wonder
>>> if overall this could be beneficial in a program where in the vast
>>> majority of the time 20bit registers are not used and thus the
>>> __sr20__ attribute on ALL functions could be wasteful. Also it
>>> means that an __sr20__ multilib would not be required.
>>>
>>> Just an idea, I can only assume that this functionality would be
>>> possible to implement - whether it is a good idea or a bad idea I
>>> leave up to you.
>>
>>
>> I can imagine there are cases when this could save space, which is
>> partly why I allow the feature to be controlled on a per-function
>> level: as long as you know the call graph reachable from isolated
>> uses of 20-bit values, you need only protect those functions.  And
>> all interrupts, of course; this optimization wouldn't change the need
>> for sr20 on interrupt handlers.
>>
>> My feeling is that this approach complicates the mental model of the
>> ABI.  It's simple to promise that r4-r11 are preserved by a call,
>> while r12-r15 are not.  It becomes more complicated to say that only
>> 16 bits of r4-r11 are preserved over a call, and the caller must
>> save any registers that might hold 20-bit values.
>>
>> At this time, I'd say let's stick with the simple model until we see
>> what happens.  If it turns out that the impact is too high and
>> compiler support is needed to reduce it, we can revisit the idea
>> then.
>>
>> Other views?
>>
>> Peter
>>
>>
>>>
>>>
>>> - Wayne
>>>
>>>
>>> -----Original Message-----
>>> From: Peter Bigot [mailto:big...@acm.org]
>>> Sent: Friday, 27 April 2012 7:31 AM
>>> To: GCC for MSP430 - http://mspgcc.sf.net
>>> Subject: [Mspgcc-users] Draft 20-Bit Design/Interface Specification
>>> available for review
>>>
>>> The first draft of the design specification for 20-bit support in
>>> mspgcc has been added to the wiki at:
>>>
>>>
>>> https://sourceforge.net/apps/mediawiki/mspgcc/index.php?title=Gcc47:20-Bit_Design
>>>
>>>
>>>
>>> Interested parties are invited to read the document on the wiki, and
>>> comment on this mailing list.
>>>
>>> The specification focuses on the atomic capabilities which combine
>>> to form what's normally called a "memory model".  They are highly
>>> orthogonal, although some combinations are likely to result in
>>> unexpected (i.e., buggy) behavior.
>>>
>>> I specifically invite comment on exactly what memory models should
>>> be supported by roll-up options that enable/disable the individual
>>> features described on that page, and what compiler option will
>>> identify them.  (It'd be something like "-mmodel=large", but what
>>> exactly should "large" mean?)  Preferably the naming and semantics
>>> of these models should tie back to existing practice in other
>>> MSP430 toolchains or other GCC back-ends.
>>>
>>> At this time, I believe what's described is technically possible,
>>> but the management of the namespaces, correct specification of
>>> pointer types based on code and options, and exactly how to get the
>>> linker to assign data and function objects into the split address
>>> space are yet to be implemented, and some showstopper issue might
>>> arise.  (As far as I can tell nobody's ever made binutils support a
>>> split address space before.  Getting 20-bit integers to work
>>> produced five bug reports against upstream gcc so far, all of which
>>> have been fixed in mspgcc with at least one already fixed upstream
>>> as well.)
>>>
>>> Peter
