D programming practices: object construction order

Denis Koroskin Fri, 06 Mar 2009 12:25:14 -0800

It is a long and boring post so you may want to go right to the conclusion.


Consider the following class hierarchy:

class A
{
// fields and virtual methods
};

class B : public A
{
// fields and virtual methods
};

class C : public B
{
// fields and virtual methods
};

Here is what happens when an object of type C is constructed in C++:

1) Allocate memory to store C
2) Call A.ctor on object. This initializes vtbl (with pointers to methods of A) 
and passes control to user code
3) Call B.ctor on object. This initializes vtbl (with pointers to methods of B) 
and passes control to user code
4) Call C.ctor on object. This initializes vtbl (with pointers to methods of C) 
and passes control to user code

Let's call it bottom to top object construction.
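
The steps above can be observed with a minimal C++ sketch. The log variable 
g_trace and the helper trace_construction() are hypothetical names introduced 
only for this illustration. Each ctor makes a virtual call, and the call 
resolves to the override of the class whose ctor is currently running:

```cpp
#include <string>

// Hypothetical log: records which override each constructor reached.
static std::string g_trace;

struct A {
    A() { g_trace += name(); }   // while A's ctor runs, the dynamic type is A
    virtual ~A() = default;
    virtual std::string name() const { return "A"; }
};

struct B : public A {
    B() { g_trace += name(); }   // while B's ctor runs, the dynamic type is B
    std::string name() const override { return "B"; }
};

struct C : public B {
    C() { g_trace += name(); }   // only now does the call reach C::name
    std::string name() const override { return "C"; }
};

// Constructs one C and returns the overrides reached, in call order.
std::string trace_construction() {
    g_trace.clear();
    C c;
    return g_trace;
}
```

Calling trace_construction() yields "ABC": the ctor of A never reaches the 
overrides in B or C, and the ctor of B never reaches the override in C.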

We are studying good practices now, so assume that the ctors of A, B and C are 
well written and initialize all of their variables (or *intentionally* leave 
them uninitialized).

Since the base class ctor is forcibly called before the derived ctor's user 
code runs, a ctor body in C++ cannot see a base class member that the base 
ctor has not yet initialized:

B::B(/*...*/) : /*...*/
{
   // It is enforced by the compiler that all of the A members are already
   // initialized by now.
   // We also cannot access any of the C members, directly or indirectly
   // (via virtual functions).
}
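
A small C++ sketch of that ordering, under the assumption that every field is 
initialized by its own class. The Field helper, the g_order log and 
construction_order() are hypothetical names used only here. Each field records 
the moment it is initialized, and each ctor body records the moment it runs:

```cpp
#include <string>

// Hypothetical log for this illustration.
static std::string g_order;

// Helper whose constructor records when a field gets initialized.
struct Field {
    explicit Field(const char* label) { g_order += label; g_order += ' '; }
};

struct A {
    Field a{"A.field"};
    A() { g_order += "A.ctor "; }
};

struct B : public A {
    Field b{"B.field"};
    B() { g_order += "B.ctor "; }   // A.field is ready, C.field does not exist yet
};

struct C : public B {
    Field c{"C.field"};
    C() { g_order += "C.ctor "; }
};

// Constructs one C and returns the recorded initialization order.
std::string construction_order() {
    g_order.clear();
    C obj;
    return g_order;
}
```

construction_order() returns "A.field A.ctor B.field B.ctor C.field C.ctor ", 
so by the time the body of B's ctor runs, everything belonging to A is already 
set up and nothing belonging to C has been touched.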

Now let's take a look at D.

In D, we have a different object construction order:
1) Allocate memory to store C
2) Initialize the object with C.init. This includes vtbl initialization with 
pointers to methods of C.
3) Call C.ctor on the object. Control goes to user code, so the user decides 
when the parent classes' ctors are run.

So the question is - when do we run parent class ctor: at the beginning or at 
the end?

Since we are all talking about non-nullable types, we must be sure that they 
are indeed fully initialized before we access them:

class B
{
   this() { foo = new Foo(); }

   Foo foo;
}

class C : B
{
   // example 1:
   this()
   {
       writefln(foo.toString()); // error, foo is not initialized yet
       super();
   }

   // example 2:

   this()
   {
       super();
       writefln(foo.toString()); // fine, foo is initialized
   }
}

So here is recommendation №1:
- don't access base class members before base class ctor is run

What about virtual functions? Consider the following example:

class B
{
   string toString() { return super.toString() ~ ", B: " ~ foo.toString(); }
   Foo foo;
}

class C : B
{
   string toString()
   {
       return super.toString() ~ ", C: " ~ bar.toString();
   }

   Bar bar;

   // example 1:
   this()
   {
       // Dang! foo and bar are not initialized but accessed
       writefln(toString());
       bar = new Bar();
       super();
   }

   // example 2:
   this()
   {
       bar = new Bar();
       super();
       writefln(toString()); // fine, foo and bar are initialized
   }
}

So here is recommendation №2:
- initialize all your variables and call super() before you call any member 
function

A consequence of recommendation 2:
- don't pass 'this' to any function or store it globally before you initialize 
all your variables and call super(). Static methods are ok, because they don't 
have access to 'this' (unless it is passed as one of the parameters, of course).

What about virtual functions called inside base class ctor? Here is an example:

class B : A
{
   Foo foo;
   string toString() { return super.toString() ~ ", B: " ~ foo.toString(); }

   this()
   {
       foo = new Foo();
       writefln(toString()); // Dang! C.bar is not initialized yet. See below
   }
}

class C : B
{
   Bar bar;
   string toString() { return super.toString() ~ ", C: " ~ bar.toString(); }

   this()
   {
       super();
       bar = new Bar();
   }
}

And here is a gotcha: since the vtbl is set up differently in D, we have no 
pure virtual function call errors. But a base class ctor is able to reach, 
through a virtual call, derived class variables that are not initialized yet.

Here comes recommendation №3:
- initialize all your variables *before* you call the base class ctor.

Now this is something that is different from C++, different from what we are 
used to. But this is the way we need to follow to make sure our fields are not 
accessed before they are initialized.

Conclusion
----------

Since D follows an object construction order different from that of C++, here 
is the recommended pattern:

class Foo : Bar
{
   this()
   {
       // Initialize all your variables.
       // This includes leaving some of them default-initialized on purpose
       // (unless they are non-nullable).
       // You shouldn't read any member fields or call any member functions yet.

       super();

       // now do something useful (object registration etc)!

       // Your object is *fully* and *correctly* constructed by now.
       // You may call any functions without any risk of accessing
       // uninitialized members.
   }
}

I think this is the only correct way to follow. I even believe that it should 
be statically enforced by the compiler. It certainly should be if we want to 
see non-nullable types in D one day.

What do you think?
