On Friday, 7 March 2014 at 02:57:38 UTC, Walter Bright wrote:
Yes, so that the user selects it, rather than having it wired in everywhere and the user has to figure out how to defeat it.

BTW you know what would help this? A pragma we can attach to a struct which makes it a very thin value type.

pragma(thin_struct)
struct A {
   int a;
   int foo() { return a; }
   static A get() { return A(10); }
}

void test() {
    A a = A.get();
    printf("%d", a.foo());
}

With the pragma, A would be completely indistinguishable from int in all ways.
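
To make the starting point concrete, here is a minimal sketch in today's D, without the hypothetical pragma(thin_struct). It only checks that A already has the same size and alignment as int, so the difference being discussed is purely in how the value is passed and returned, not in its layout. The main function and the use of writeln are my additions for illustration.

```d
import std.stdio : writeln;

struct A {
    int a;
    int foo() { return a; }
    static A get() { return A(10); }
}

// Layout-wise A is already "thin": same size and alignment as int.
// What the proposed pragma would change is the calling convention only.
static assert(A.sizeof == int.sizeof);
static assert(A.alignof == int.alignof);

void main() {
    A a = A.get();
    writeln(a.foo());
}
```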

What do I mean?
$ dmd -release -O -inline test56 -c

Let's look at A.foo:

A.foo:
   0:   55                      push   ebp
   1:   8b ec                   mov    ebp,esp
   3:   50                      push   eax
   4:   8b 00                   mov    eax,DWORD PTR [eax] ; waste!
   6:   8b e5                   mov    esp,ebp
   8:   5d                      pop    ebp
   9:   c3                      ret


It is the instruction at offset 4 that bugs me: the struct is passed as a *pointer*, but its only contents are an int, which could just as well be passed as a value. Let's compare it to a free function that does the same thing:

int identity(int a) { return a; }

00000000 <_D6test568identityFiZi>:
   0:   55                      push   ebp
   1:   8b ec                   mov    ebp,esp
   3:   83 ec 04                sub    esp,0x4
   6:   c9                      leave
   7:   c3                      ret

lol it *still* wastes time, setting up a stack frame for nothing. But we could just as well write asm { naked; ret; } and it would work as expected: the argument is passed in EAX and the return value is expected in EAX. The function doesn't actually have to do anything.
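
A sketch of that naked version, assuming 32-bit x86 and the D calling convention described above (single int argument in EAX, return value in EAX). The version guard and the plain-D fallback are my additions so the file still compiles on other targets; they are not part of the original idea.

```d
import std.stdio : writeln;

int identity(int a) {
    version (D_InlineAsm_X86) {
        // Assumes the 32-bit D ABI: `a` arrives in EAX and the result
        // is expected in EAX, so returning immediately is enough.
        asm {
            naked;
            ret;
        }
    } else {
        // Fallback for targets where the assumption above does not hold.
        return a;
    }
}

void main() {
    writeln(identity(42));
}
```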


Anywho, the struct could work the same way. Now, I understand that we can't just change this unilaterally since it would break interaction with the C ABI, but we could opt in to some thinner stuff with a pragma.


Ideally, the thin struct would generate this code:

A A.get() {
   naked { // no need for stack frame here
       mov EAX, 10;
       ret;
   }
}

return A(10); when A is thin should be equal to return 10;. No need for NRVO, the object is super thin.

int A.foo() {
   naked { // no locals, no stack frame
       ret; // the last argument (this) is passed in EAX
            // and the return value goes in EAX
            // so we don't have to do anything
   }
}

Without the thin_struct thing, this would minimally look like

mov EAX, [EAX];
ret;

since it has to load the value through the this pointer. But since it is thin, it is generated identically to an int, like the identity function above, so the value is already in the register!
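
For comparison, something you can do today without any pragma: write foo as a free function taking the struct by value and call it through UFCS. This is only a sketch of the by-value idea; whether the argument really ends up in a register depends on the compiler and target ABI, and the free function foo here is a hypothetical stand-in for the member function A.foo.

```d
import std.stdio : writeln;

struct A {
    int a;
}

// Hypothetical free-function replacement for the member A.foo.
// `self` is passed by value, so there is no this pointer to load through.
int foo(A self) {
    return self.a;
}

void main() {
    auto a = A(10);
    // UFCS lets the call site look exactly like a member call.
    writeln(a.foo());
}
```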

Then, test:

void test() {
    naked { // don't need a stack frame here either!
        call A.get;
        // a is now in EAX, the value loaded right up
        call A.foo; // the this is an int and already
                    // where it needs to be, so just go
        // and finally, go ahead and call printf
        push EAX;
        push "%d".ptr;
        call printf;
        ret;
    }
}


Then, naturally, inlining A.get and A.foo might be possible (though I'd love to write them in assembly myself* and the compiler prolly can't inline them) but call/ret is fairly cheap, especially when compared to push/pop, so just keeping all the relevant stuff right in registers with no need to reference can really help us.

pragma(thin_struct)
struct RangedInt {
  int a;
  RangedInt opBinary(string op : "+")(int rhs) {
   asm {
     naked;
     add EAX, [rhs]; // or RDI on 64 bit! Don't even need to touch the stack! **
     jo throw_exception;
     ret;
   }
  }
}
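
For reference, a portable sketch of the same overflow-checked add without inline asm, using adds from druntime's core.checkedint. This swaps the jo approach for a library call, so it is not the thin-struct codegen being proposed; the exception message and the main function are placeholders of mine.

```d
import core.checkedint : adds;
import std.stdio : writeln;

struct RangedInt {
    int a;

    RangedInt opBinary(string op : "+")(int rhs) {
        bool overflow = false;
        // adds sets `overflow` to true if the signed addition wrapped.
        int result = adds(a, rhs, overflow);
        if (overflow)
            throw new Exception("RangedInt overflow");
        return RangedInt(result);
    }
}

void main() {
    auto r = RangedInt(40) + 2;
    writeln(r.a);

    try {
        auto bad = RangedInt(int.max) + 1;
        writeln(bad.a);
    } catch (Exception e) {
        writeln("caught: ", e.msg);
    }
}
```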


Might still not be as perfect as intrinsics like bearophile is thinking of... but we'd be getting pretty close. And this kind of thing would be good for other thin wrappers too, we could magically make smart pointers too! (This can't be done now since returning a struct is done via hidden pointer argument instead of by register like a naked pointer).

** i'd kinda love it if we had an all-register calling convention on 32 bit too.... but eh oh well
