https://issues.dlang.org/show_bug.cgi?id=14912
Issue ID: 14912
Summary: Move initialisation of GC'd struct and class data from
the callee to the caller
Product: D
Version: D2
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P1
Component: dmd
Assignee: [email protected]
Reporter: [email protected]
Currently, druntime initialises all GC'd data in the callee.
Examples:
_d_newclass():
    p[0 .. ci.init.length] = ci.init[];
_d_newitemT():
    memset(p, 0, _ti.tsize);
_d_newitemiT():
    memcpy(p, init.ptr, init.length);
Each of these results in a library call (memset/memcpy or an array copy)
inside the runtime. And because the implementation is always hidden away, the
optimizer (or an optimizing backend) cannot assume anything about the contents
of the memory returned by these calls.
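To make the current situation concrete, here is a minimal user-level sketch of callee-side initialisation. It is not the druntime source: `calleeInitAlloc`, `makeS` and the struct `S` are hypothetical names chosen for illustration, and the function only mimics the memset/memcpy pattern quoted above using `core.memory.GC` and `TypeInfo`.

```d
import core.memory : GC;
import core.stdc.string : memcpy, memset;

struct S
{
    int x = 7;   // non-zero default, so S needs an initializer image
    int y;
}

// Hypothetical stand-in for a runtime allocation hook.
// The caller only ever sees the returned pointer; the stores happen in here.
void* calleeInitAlloc(const TypeInfo ti)
{
    void* p = GC.malloc(ti.tsize);
    const(void)[] init = ti.initializer();
    if (init.ptr is null)
        memset(p, 0, ti.tsize);           // all-zero initializer
    else
        memcpy(p, init.ptr, init.length); // copy the initializer image
    return p;
}

S* makeS()
{
    return cast(S*) calleeInitAlloc(typeid(S));
}
```

If `calleeInitAlloc` lives in a separately compiled library, a compiler looking at `makeS` has no way to know that `x` is 7 and `y` is 0 in the result.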
For instance, in a very simple case:
class A
{
    int foo () { return 42; }
}
int test()
{
    A a = new A(), b = a;
    return b.foo();
}
If the contents of 'a' were set by the caller, in code generated by the
compiler, we would have the following codegen (pseudo-code):
int test()
{
    struct A *a;
    struct A *b;
    a = new A();
    *a = A.init;
    b = a;
    return b.__vptr.foo(b);
}
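The pseudo-code above can be approximated in ordinary D as a rough sketch. The helper `allocRaw` is hypothetical and stands in for a runtime call that only allocates; the copy of the initializer image is then done in `test` itself, where an optimizer can see it. This assumes a class without a constructor or finalizer, as in the example.

```d
import core.memory : GC;
import core.stdc.string : memcpy;

class A
{
    int foo() { return 42; }
}

// Hypothetical helper: returns raw GC memory sized for an instance of T.
// It deliberately does not write the class initializer.
void* allocRaw(T)() if (is(T == class))
{
    return GC.malloc(__traits(classInstanceSize, T));
}

int test()
{
    void* p = allocRaw!A();

    // Caller-side initialisation: the equivalent of "*a = A.init".
    // The initializer image holds the vtable pointer and a null monitor.
    const(void)[] init = typeid(A).initializer();
    memcpy(p, init.ptr, init.length);

    A a = cast(A) p, b = a;
    return b.foo();
}
```

Because the copy is now part of `test`, its source and size are known at the call site, which is the precondition for the steps that follow.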
From that, an optimizer can break down and inline the default initializer
without the need for memset/memcpy:
// ...
a = new A();
a.__vptr = typeid(A).vtbl.ptr;
a.__monitor = null;
// ...
Perform copy propagation to replace all occurrences of b with a:
// ...
return *(a.__vptr + 40)(a);
// ...
Global value numbering to resolve the lookup in the vtable, and de-virtualize
the call:
// ...
return A.foo(a);
// ...
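The difference between the vtable call and the de-virtualized call can be shown in plain D, as a small sketch. The class `B` and the functions `viaVtable` and `direct` are hypothetical additions for illustration; `a.A.foo()` is D's syntax for a statically bound call to `A.foo`.

```d
class A
{
    int foo() { return 42; }
}

// Hypothetical subclass, only here to show why the vtable contents matter.
class B : A
{
    override int foo() { return 7; }
}

int viaVtable(A b)
{
    return b.foo();     // dynamic dispatch through the object's vtable
}

int direct(A a)
{
    return a.A.foo();   // statically bound to A.foo, no vtable lookup
}
```

The two functions agree only when the object's dynamic type is exactly `A`. Seeing the store of A's vtable pointer in the caller is what lets an optimizer prove that and rewrite the first form into the second.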
After some dead code removal, the inliner now sees the direct call and is ready
to inline A.foo:
int test()
{
    struct A *a = new A();
    a.__vptr = typeid(A).vtbl.ptr;
    a.__monitor = null;
    return 42;
}
There is another challenge here to remove the dead GC allocation (that will
have to wait for another bug report). But I think that this simple change is
justified by the opportunity to produce much better code when classes are used
in at least simple ways, and I haven't even considered the possibilities that
LTO opens up.
If there are no objections, I suggest that we make a push for this. It will
require dmd to update its own NewExp::toElem, and the memcpy/memset parts to
be removed from druntime.