On 6/19/2012 1:58 PM, Manu wrote:
> I find a thorough suite of architecture intrinsics is usually the fastest and
> cleanest way to the best possible code, although 'naked' may be handy in this
> circumstance too...
Do a grep for "naked" across the druntime library sources. For example, its use
in druntime/src/rt/alloca.d, where it is very much needed, as alloca() is one of
those "magic" functions.
> If a function is written from intrinsics, then it can inline and better adapt to
> the calling context. It's very common that you use asm to write super-efficient
> micro-functions (memory copying/compression, linear algebra, matrix routines,
> DSPs, etc), which are classic candidates for being inlined.
> So I maintain, naked is useful, but asm is not (assuming you have a high level
> way to address registers like the stack pointer directly).
Do a grep for "asm" across the druntime library sources. Can you justify all of
that with some other scheme?
> Thinking more about the implications of removing the inline asm, what would
> REALLY roxors, would be a keyword to insist a variable is represented by a
> register, and by extension, to associate it with a specific register:
> register int x;       // compiler assigns an unused register, promises
>                       // it will remain resident, error if it can't maintain promise.
> register int x : rsp; // x aliases RSP; can now produce a function
>                       // pre/postamble in high level code.
> Repeat for the argument registers -> readable, high-level custom calling
> conventions!
This was a failure in C.
> This would almost entirely eliminate the usefulness of an inline assembler.
> Better yet, this could use the 'new' attribute syntax, which most agree will
> support arguments:
> @register(rsp) int x;
Some C compilers did have such pseudo-register abilities. It was a failure in
practice.
I really don't understand preferring all these rather convoluted enhancements to
avoid something simple and straightforward like the inline assembler. The use of
IA in the D runtime library, for example, has been quite successful.
For example, consider this bit from druntime/src/rt/lifetime.d:
-------------------------------------------------------------------
auto isshared = ti.classinfo is TypeInfo_Shared.classinfo;
auto bic = !isshared ? __getBlkInfo((*p).ptr) : null;
auto info = bic ? *bic : gc_query((*p).ptr);
auto size = ti.next.tsize();
version (D_InlineAsm_X86)
{
    size_t reqsize = void;
    asm
    {
        mov EAX, newcapacity;
        mul EAX, size;
        mov reqsize, EAX;
        jc Loverflow;
    }
}
else
{
    size_t reqsize = size * newcapacity;
    if (newcapacity > 0 && reqsize / newcapacity != size)
        goto Loverflow;
}
// step 2, get the actual "allocated" size. If the allocated size does not
// match what we expect, then we will need to reallocate anyways.
// TODO: this probably isn't correct for shared arrays
size_t curallocsize = void;
size_t curcapacity = void;
size_t offset = void;
size_t arraypad = void;
----------------------------------------------
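To make the comparison concrete, here is a rough sketch of the portable branch pulled out as a standalone function. The name mulOverflows and its signature are made up for illustration and are not part of druntime; only the overflow test itself comes from the excerpt above. The asm version gets the same answer from the carry flag that MUL sets, whereas this version has to pay for a divide to find out.

```d
// Hypothetical helper, not part of druntime: the portable branch of the
// excerpt above, wrapped in a function.
// Returns true if size * newcapacity does not fit in a size_t.
bool mulOverflows(size_t size, size_t newcapacity, out size_t reqsize)
{
    reqsize = size * newcapacity; // wraps modulo 2^N on overflow

    // If the product wrapped, dividing it back can no longer yield size.
    return newcapacity > 0 && reqsize / newcapacity != size;
}

unittest
{
    size_t r;
    assert(!mulOverflows(4, 10, r) && r == 40); // fits
    assert(mulOverflows(size_t.max, 2, r));     // wraps around
    assert(!mulOverflows(8, 0, r) && r == 0);   // zero capacity never overflows
}
```

The unittest block runs when the file is built with unit tests enabled (for example with dmd's -unittest and -main switches).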