Thank you for explaining.
See comments inline.

On Wednesday, 10 October 2012 at 18:12:13 UTC, Jonathan M Davis wrote:
On Wednesday, October 10, 2012 13:40:06 foobar wrote:
Can you please elaborate on where the .init property is being
relied on? This is an aspect of D I don't really understand.
What's the difference between a no-arg ctor and one with args in
relation to this requirement?

init is used anywhere and everywhere that an instance of a type needs to be default-initialized. _One_ of those places is a local variable. Not all places where an instance of an object needs to be created can be initialized by the programmer. The prime example of this would be arrays. If you declare

auto i = new int[](5);

or

int[12] s;

all of the elements in the array need to be initialized or they'll be garbage, and there's no way for the programmer to indicate what values they should be. The whole point of init is to avoid having variables ever be garbage without the programmer explicitly asking for it. And having a default constructor wouldn't help one whit with arrays, because the values of their elements must be known at compile time (otherwise you couldn't directly initialize member variables or static variables or anything else which requires a value at compile time with an array). With init, the compiler can take advantage of the fact that it knows the init value at compile time to efficiently initialize the array.
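A minimal sketch of the behaviour described above, assuming a reasonably recent D compiler. The struct name Point is hypothetical and only there for illustration:

```d
import std.math : isNaN;

// Hypothetical struct used only for illustration.
struct Point
{
    int x = 3;   // direct initializer becomes part of Point.init
    int y;       // no initializer, so int.init (0) is used
}

void main()
{
    // Built-in types each have an init value known at compile time.
    static assert(int.init == 0);
    static assert(char.init == 0xFF);
    assert(isNaN(double.init));

    // Dynamic and static arrays are filled with the element type's init.
    auto i = new int[](5);
    int[12] s;
    foreach (e; i) assert(e == 0);
    foreach (e; s) assert(e == 0);

    // The same holds for arrays of user-defined structs.
    Point[4] ps;
    foreach (p; ps) assert(p.x == 3 && p.y == 0);
    static assert(Point.init.x == 3);
}
```

Nothing here runs a constructor per element; the elements are simply copies of a value the compiler already knows.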


I understand the idea of default initialization. I was more interested in the machinery and implementation details :) So let's dive into those details. For arrays, without changing the existing syntax, we can use these semantics:

auto a = new int[](5); // compiler calls T() for each instance
int[12] b; // ditto

This would be the same as in C++. We could also extend the syntax and allow:
auto c = new int[](5, 9); // init all instances to 9
auto d = new int[](5, int (int index) { return index; }); // init each element via a function call
This can be generalized to multiple dimensions.

But even constructing objects sanely relies on init. All user-defined objects are fully initialized to what their member variables are directly initialized to before their constructors are even called. In the case of a struct, that's the struct's init value. It's not for a class, because you can't have a class separate from its reference (so it's the reference which gets the init value), but the class still has a state equivalent to a struct's init value, and that's the state that it has before any of its constructors are called.

So for classes, .init is null, which complicates non-nullable classes. It seems the "solution" (more like a hack IMO) of @disable _breaks_ the .init guarantee in the language.
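For reference, a small sketch of what @disable this() does on a struct, as far as I understand current compilers. The struct name Handle and its field fd are hypothetical:

```d
// Hypothetical struct used only for illustration.
struct Handle
{
    int fd;
    @disable this();               // forbid default construction
    this(int fd) { this.fd = fd; }
}

void main()
{
    // Handle h;   // rejected: default construction is disabled
    static assert(!__traits(compiles, { Handle h; }));

    // Explicit construction still works.
    auto h = Handle(3);
    assert(h.fd == 3);

    // The init value itself still exists and can be named explicitly.
    auto raw = Handle.init;
    assert(raw.fd == 0);
}
```

So the type no longer promises that every variable starts out as .init, yet .init remains reachable, which is the tension being discussed here.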


If it weren't for that, you'd get the insanity that C++ or Java have with regards to the state of objects prior to construction. C++ is particularly bad in that each derived class is created in turn, meaning that when a constructor is called, the object _is_ that class rather than the derived class that you're ultimately constructing (which means that things can go horribly wrong if you're stupid enough to call a virtual function from a constructor in C++). I believe that Java handles that somewhat better, but it gets bizarre ordering issues with regards to initializing member variables that cause problems if you try and alter member variables from base classes inside of a derived constructor. With D, the object is guaranteed to be in a sane state prior to construction.
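A minimal sketch of the D side of this claim. The class names Base and Derived and the member names are hypothetical. The point is only that a virtual call made from the base constructor already sees the derived class's directly initialized members:

```d
// Hypothetical classes used only for illustration.
class Base
{
    this()
    {
        // Virtual call from a base-class constructor.
        describe();
    }

    void describe() {}
}

class Derived : Base
{
    int x = 42;      // direct initializer, part of the object's initial state
    int seenInBase;  // records what the base constructor observed

    this()
    {
        super();
        x = 100;
    }

    override void describe()
    {
        seenInBase = x;
    }
}

void main()
{
    auto d = new Derived;
    // The override ran during Base's constructor and saw x == 42,
    // i.e. the initial state, before Derived's constructor body set it to 100.
    assert(d.seenInBase == 42);
    assert(d.x == 100);
}
```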


C++ is insanely bad here mainly due to [virtual?] multiple inheritance (MI), which doesn't affect D, and Java _allows_ virtual methods in constructors, which I think is also "fixed" in the latest C++ standard. I don't know about the ordering problems you mention, but AFAIK the complication arises with MI, not default initialization. It's just a matter of properly defining the inheritance semantics.



And without init, even if every place that an object is instantiated could be directly initialized by the programmer (which it can't), then you would either end up with garbage every time that a variable isn't directly initialized, or you'd have to directly initialize them all. In order for D's construction model to work, this would include directly initializing _all_ member variables even if the constructor then set them to something else (which would actually cause problems with const and immutable). And that would get _very_ annoying, even if it would be preferable for the local variable to require explicit initialization.

You talk about:
class C {
immutable T val; // what to do here?
this() { ... }
}

This can be solved by either requiring a ctor call at the declaration of val (the line marked "what to do here?"), with T() being called if none is specified, or by requiring the initialization to happen in the ctor, a la C++ semantics.





Another case where init is required is out parameters. All out parameters are set to their init value when the function is called in order to avoid bugs caused by reading the value of an out parameter before it's set within the function. That wouldn't work at all without init.
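A short sketch of the out-parameter behaviour described above. The function name readBeforeWrite is hypothetical:

```d
// Hypothetical function: it reads its out parameter before assigning it.
int readBeforeWrite(out int result)
{
    int observed = result; // always int.init, whatever the caller passed in
    result = 7;
    return observed;
}

void main()
{
    int value = 123;
    int observed = readBeforeWrite(value);
    assert(observed == 0); // the caller's 123 was reset to int.init on entry
    assert(value == 7);
}
```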

Personally, I'd just remove this feature from the language; tuples are a far better design for returning multiple values. Even with this feature intact, we could always use the default no-arg constructor.
E.g.
void foo(out T val);
becomes:
void foo(out T val = T());


One of the more annoying AA bugs makes it so that if the foo function in this code

aa[5] = foo();

throws, then aa[5] gets set with an init value of the element type. While this clearly shouldn't happen, imagine how much worse it would be if we didn't have init, and that element got set to garbage?


I don't get this example. If foo throws, then the calling code will get control. How would you ever get to read that garbage in aa[5]? The surrounding try/catch block should take care of this explicitly anyway.

E.g.
try {
 aa[5] = foo(); // foo throws
 // ## do something with aa[5], this won't happen
} catch {
// Please handle aa[5] here explicitly.
//
}
// @@ do something with aa[5], works due to the explicit fix in the catch.




There are probably other cases that I can't think of right now where init gets used - probably in the runtime if nowhere else. Every place that could possibly result in a variable being garbage _doesn't_ result in garbage, because we have init.

And regardless of what the language does, there are definitely places where the standard library takes advantage of init. It uses it a lot for type inference, but it also uses it directly in places such as std.algorithm.move. Without init, it would end up dealing with garbage values. It's also a lifesaver in generic code, because without it, generic code _can't_ initialize variables in many cases. Take something like

T t;

if(cond)
{
 ...
 t = getValue();
 ...
}
else
{
 ...
 t = getOtherValue();
 ...
}

How on earth could a generic function initialize t without T.init? void? That's just begging for bugs when one of the paths doesn't actually set t like it's supposed to. It doesn't know anything about the type and therefore doesn't know what a reasonable default value would be, so it can't possibly initialize t properly.
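To make the two options concrete, here is a small sketch. The helper names pick and pickVoid are hypothetical. The first relies on T.init and has a deliberately forgotten branch; the second uses = void and is only safe because both branches assign:

```d
// Hypothetical generic helper: the else branch forgets to assign t.
T pick(T)(bool cond, T a)
{
    T t;              // default-initialized to T.init
    if (cond)
        t = a;
    // else: bug, t is never assigned, but it still holds T.init
    return t;
}

// Same shape with default initialization switched off.
// If a branch forgot to assign here, t would be garbage.
T pickVoid(T)(bool cond, T a, T b)
{
    T t = void;       // explicitly uninitialized
    if (cond)
        t = a;
    else
        t = b;
    return t;
}

void main()
{
    assert(pick(true, 5) == 5);
    assert(pick(false, 5) == int.init);   // the bug yields a defined value
    assert(pick(false, "x") is null);     // string.init is null

    assert(pickVoid(true, 1, 2) == 1);
    assert(pickVoid(false, 1, 2) == 2);
}
```

With T.init the forgotten branch is still a bug, but a reproducible one; with = void the same mistake would read uninitialized memory.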


Doesn't @disable break those algorithms in Phobos anyway? How would that work for non-nullable classes? To answer the above question, I'd say there's nothing wrong with initializing to void. This is what happens anyway, since the .init value isn't used and the optimizer will optimize it away.

I can understand preferring that local variables have to be directly initialized by the programmer, but it just doesn't scale. Having init is _far_ more flexible and far more powerful. Any and every situation that might need to initialize a variable can do it. Without init, that just isn't possible.

- Jonathan M Davis

Again, thanks for the explanation. On a general level I have to agree with Don's post: I don't see how the .init idiom generally "works" or is useful. I can't see anything in the above examples showing that .init is absolutely required and that we can't live without it. The only thing that worries me here is the reliance of the runtime and Phobos on .init.
