Rory McGuire wrote:
> On Wed, 21 Jul 2010 03:58:33 +0200, bearophile <[email protected]>
> wrote:
>
>> Andrei Alexandrescu:
>>
>>> emplace(), defined in std.conv, is relatively new. I haven't yet added
>>> emplace() for class objects, and this is as good an opportunity as any:
>>> http://www.dsource.org/projects/phobos/changeset/1752
>>
>> Thank you, I have used this, and later I have done a few tests too.
>>
>> The "scope" for class instantiations can be deprecated once there is an
>> acceptable alternative. You can't deprecate features before you have
>> found a good enough alternative.
>>
>> ---------------------
>>
>> A first problem is the syntax, to allocate an object on the stack you
>> need something like:
>>
>> // is testbuf correctly aligned?
>> ubyte[__traits(classInstanceSize, Test)] testbuf = void;
>> Test t = emplace!(Test)(cast(void[])testbuf, arg1, arg2);
>>
>>
>> That is much worse looking, hairier and more error-prone than:
>> scope Test t = new Test(arg1, arg2);
>>
>>
>> I have tried to build a helper to improve the situation, like something
>> that looks like:
>> Test t = StackAlloc!(Test, arg1, arg2);
>>
>> But failing that, my second try was this, not good enough:
>> mixin(stackAlloc!(Test, Test)("t", "arg1, arg2"));
>>
>> ---------------------
>>
>> A second problem is that this program compiles with no errors:
>>
>> import std.conv: emplace;
>>
>> final class Test {
>>     int x, y;
>>     this(int xx, int yy) {
>>         this.x = xx;
>>         this.y = yy;
>>     }
>> }
>>
>> Test foo(int x, int y) {
>>     ubyte[__traits(classInstanceSize, Test)] testbuf = void;
>>     Test t = emplace!(Test)(cast(void[])testbuf, x, y);
>>     return t;
>> }
>>
>> void main() {
>>     foo(1, 2);
>> }
>>
>>
>>
>> While the following one gives:
>> test.d(13): Error: escaping reference to scope local t
>>
>>
>> import std.conv: emplace;
>>
>> final class Test {
>>     int x, y;
>>     this(int xx, int yy) {
>>         this.x = xx;
>>         this.y = yy;
>>     }
>> }
>>
>> Test foo(int x, int y) {
>>     scope t = new Test(x, y);
>>     return t;
>> }
>>
>> void main() {
>>     foo(1, 2);
>> }
>>
>>
>> So the compiler is aware that the scoped object can't escape, while
>> using emplace things become more bug-prone. "scope" can cause other
>> bugs, some time ago I filed a bug report about one problem, but it
>> avoids the most common bug. (I am not sure that emplace solves that
>> problem with scope, I think it shares the same problem, plus adds new
>> ones).
>>
>> ---------------------
>>
>> A third problem is that the dtor doesn't get called:
>>
>>
>> import std.conv: emplace;
>> import std.c.stdio: puts;
>>
>> final class Test {
>>     this() {
>>     }
>>     ~this() { puts("killed"); }
>> }
>>
>> void main() {
>>     ubyte[__traits(classInstanceSize, Test)] testbuf = void;
>>     Test t = emplace!(Test)(cast(void[])testbuf);
>> }
>>
>>
>> That prints nothing. Using scope it gets called (even if it's not
>> present!).
>>
>> ---------------------
>>
>> This is not a problem of emplace(), it's a problem of the dmd optimizer.
>> I have done a few performance tests too. I have used this basic
>> pseudocode:
>>
>> while (i < Max)
>> {
>>     create testObject(i, i, i, i, i, i)
>>     testObject.doSomething(i, i, i, i, i, i)
>>     testObject.doSomething(i, i, i, i, i, i)
>>     testObject.doSomething(i, i, i, i, i, i)
>>     testObject.doSomething(i, i, i, i, i, i)
>>     destroy testObject
>>     i++
>> }
>>
>>
>> Coming from here:
>> http://www.drdobbs.com/java/184401976
>> And its old timings:
>> http://www.ddj.com/java/184401976?pgno=9
>>
>>
>> The Java version of the code is simple:
>>
>> final class Obj {
>>     int i1, i2, i3, i4, i5, i6;
>>
>>     Obj(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
>>         this.i1 = ii1;
>>         this.i2 = ii2;
>>         this.i3 = ii3;
>>         this.i4 = ii4;
>>         this.i5 = ii5;
>>         this.i6 = ii6;
>>     }
>>
>>     void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
>>     }
>> }
>>
>> class Test {
>>     public static void main(String args[]) {
>>         final int N = 100_000_000;
>>         int i = 0;
>>         while (i < N) {
>>             Obj testObject = new Obj(i, i, i, i, i, i);
>>             testObject.doSomething(i, i, i, i, i, i);
>>             testObject.doSomething(i, i, i, i, i, i);
>>             testObject.doSomething(i, i, i, i, i, i);
>>             testObject.doSomething(i, i, i, i, i, i);
>>             // testObject = null; // makes no difference
>>             i++;
>>         }
>>     }
>> }
>>
>>
>>
>> This is a D version that uses emplace() (if you don't use emplace here
>> the performance of the D code is very bad compared to the Java one):
>>
>> // program #1
>> import std.conv: emplace;
>>
>> final class Test { // 32 bytes each instance
>>     int i1, i2, i3, i4, i5, i6;
>>     this(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
>>         this.i1 = ii1;
>>         this.i2 = ii2;
>>         this.i3 = ii3;
>>         this.i4 = ii4;
>>         this.i5 = ii5;
>>         this.i6 = ii6;
>>     }
>>     void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
>>     }
>> }
>>
>> void main() {
>>     enum int N = 100_000_000;
>>
>>     int i;
>>     while (i < N) {
>>         ubyte[__traits(classInstanceSize, Test)] buf = void;
>>         Test testObject = emplace!(Test)(cast(void[])buf, i, i, i, i, i, i);
>>         // Test testObject = new Test(i, i, i, i, i, i);
>>         // scope Test testObject = new Test(i, i, i, i, i, i);
>>         testObject.doSomething(i, i, i, i, i, i);
>>         testObject.doSomething(i, i, i, i, i, i);
>>         testObject.doSomething(i, i, i, i, i, i);
>>         testObject.doSomething(i, i, i, i, i, i);
>>         testObject = null;
>>         i++;
>>     }
>> }
>>
>>
>> The Java code (server) runs in about 0.25 seconds here.
>> The D code (that doesn't do heap allocations at all) runs in about 3.60
>> seconds.
>>
>> With a bit of experimenting I have seen that emplace() doesn't get
>> inlined, and the cause is that it contains enforce(). enforce contains a
>> throw, and it seems dmd doesn't inline functions that can throw, you can
>> test it with a little test program like this:
>>
>>
>> import std.c.stdlib: atoi;
>> void foo(int b) {
>>     if (b)
>>         throw new Throwable(null);
>> }
>> void main() {
>>     int b = atoi("0");
>>     foo(b);
>> }
>>
>>
>> So if you comment out the two enforce() inside emplace() dmd inlines
>> emplace() and the running time becomes about 2.30 seconds, less than ten
>> times slower than Java.
>>
>> If emplace() doesn't contain calls to enforce() then the loop in main()
>> becomes (dmd 2.047, optimized build):
>>
>>
>> L1A:    push    dword ptr 02Ch[ESP]
>>         mov     EDX,_D10test6_good4Test7__ClassZ[0Ch]
>>         mov     EAX,_D10test6_good4Test7__ClassZ[08h]
>>         push    EDX
>>         push    ESI
>>         call    near ptr _memcpy
>>         mov     ECX,03Ch[ESP]
>>         mov     8[ECX],EBX
>>         mov     0Ch[ECX],EBX
>>         mov     010h[ECX],EBX
>>         mov     014h[ECX],EBX
>>         mov     018h[ECX],EBX
>>         mov     01Ch[ECX],EBX
>>         inc     EBX
>>         add     ESP,0Ch
>>         cmp     EBX,05F5E100h
>>         jb      L1A
>>
>>
>> (The memcpy is done by emplace to initialize the object before calling
>> its ctor. You must perform the initialization because it needs the
>> pointer to the virtual table and monitor. The monitor here was null. I
>> think a future LDC2 can optimize away more stuff in that loop, so it's
>> not so bad).
>>
>>
>> If you use this in program #1:
>> scope Test testObject = new Test(i, i, i, i, i, i);
>> It runs in about 6 seconds (also because the dtor is called even if it's
>> missing).
>>
>> If in program #1 you use just new, without scope, the runtime is about
>> 27.2 seconds, about 110 times slower than Java.
>>
>> Bye,
>> bearophile
>
> Takes 18m27.720s in PHP :)

Takes 5m26.776s in Python and 0m1.008s in Java. I can't test the D version: I don't have emplace() and dsource is ignoring me.
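
A note on comparing the Java figure with the others: in the quoted benchmark doSomething() is empty and testObject is never read, so a JIT compiler may be able to remove part of the work. Below is a minimal sketch of a variant that keeps a checksum alive and times the loop with System.nanoTime(). The names AllocBench and run, and the choice to make doSomething() return i1 + ii1, are my own assumptions for illustration and are not part of the original test.

```java
// Hypothetical variant of the quoted Java benchmark (names are assumptions).
// Each call contributes to a checksum so the loop body has an observable result.
final class AllocBench {

    static final class Obj {
        final int i1, i2, i3, i4, i5, i6;

        Obj(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
            this.i1 = ii1;
            this.i2 = ii2;
            this.i3 = ii3;
            this.i4 = ii4;
            this.i5 = ii5;
            this.i6 = ii6;
        }

        // Unlike the original empty method, this returns a value that depends
        // on both the stored field and the argument.
        int doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
            return i1 + ii1;
        }
    }

    // Allocates one object per iteration and calls doSomething() four times,
    // mirroring the structure of the quoted loop.
    static long run(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            Obj testObject = new Obj(i, i, i, i, i, i);
            sum += testObject.doSomething(i, i, i, i, i, i);
            sum += testObject.doSomething(i, i, i, i, i, i);
            sum += testObject.doSomething(i, i, i, i, i, i);
            sum += testObject.doSomething(i, i, i, i, i, i);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Iteration count defaults to the N used in the quoted post.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 100_000_000;
        long start = System.nanoTime();
        long sum = run(n);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("checksum " + sum + ", " + elapsedMs + " ms");
    }
}
```

Each iteration adds 8*i to the checksum, so for n iterations the printed checksum should be 4*n*(n-1). Running it once with a small count is a quick sanity check before timing the full 100_000_000. The timing still includes JIT warm-up, so it is only a rough figure.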
