Thank you for explaining.
See comments inline.
On Wednesday, 10 October 2012 at 18:12:13 UTC, Jonathan M Davis
wrote:
On Wednesday, October 10, 2012 13:40:06 foobar wrote:
Can you please elaborate on where the .init property is being
relied on? This is an aspect of D I don't really understand.
What's the difference between a no-arg ctor and one with args
in
relation to this requirement?
init is used anywhere and everywhere that an instance of a type
needs to be
default-initialized. _One_ of those places is a local variable.
Not all places
where an instance of an object needs to be created can be
initialized by the
programmer. The prime example of this would be arrays. If you
declare
auto i = new int[](5);
or
int[12] s;
all of the elements in the array need to be initialized or
they'll be garbage,
and there's no way for the programmer to indicate what values
they should be.
The whole point of init is to avoid having variables ever be
garbage without
the programmer explicitly asking for it. And having a default
constructor
wouldn't help one whit with arrays, because the values of their
elements must
be known at compile time (otherwise you couldn't directly
initialize member
variables or static variables or anything else which requires a
value at
compile time with an array). With init, the compiler can take
advantage of the
fact that it knows the init value at compile time to
efficiently initialize the
array.
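To make the array case concrete, here is a minimal sketch of what default initialization gives you today. The Point struct and the makeDefault helper are made up for illustration; the only thing relied on is that elements the programmer never sets start out as T.init.

```d
import std.math : isNaN;

struct Point
{
    int x = 3;   // member initializers define Point.init
    int y = 4;
}

// Hypothetical helper: returns a new array whose elements are never set explicitly.
T[] makeDefault(T)(size_t n)
{
    return new T[](n);
}

void main()
{
    auto i = makeDefault!int(5);
    assert(i == [0, 0, 0, 0, 0]);          // int.init is 0

    int[12] s;
    foreach (e; s)
        assert(e == int.init);

    auto f = makeDefault!double(2);
    assert(isNaN(f[0]) && isNaN(f[1]));    // double.init is NaN

    auto p = makeDefault!Point(2);
    assert(p[0] == Point(3, 4) && p[1] == Point.init);
}
```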
I understand the idea of default initialization. I was more
interested in the machinery and implementation details :) So
let's dive into those details:
Arrays - without changing existing syntax we can use these
semantics:
auto a = new int[](5); // compiler calls T() for each instance
int[12] b; // ditto
This would be the same as in C++. We could also expand the syntax and
allow:
auto b = new int[](5, 9); // init all instances to 9
auto b = new int[](5, int (int index) { return index; });
// initializes each member via a function call.
This can be generalized for multi dimensions.
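For comparison, the two proposed forms can be approximated in the language as it stands, without new syntax. This is only a sketch: filled and generated are hypothetical helper names, and note that the first one still default-initializes the elements before overwriting them.

```d
import std.algorithm : map;
import std.array : array;
import std.range : iota;

// Hypothetical helper: n elements, all set to `value`.
T[] filled(T)(size_t n, T value)
{
    auto a = new T[](n);   // elements start as T.init
    a[] = value;           // then every element is overwritten
    return a;
}

// Hypothetical helper: element i is computed as gen(i).
int[] generated(alias gen)(int n)
{
    return iota(n).map!gen.array;
}

void main()
{
    assert(filled(5, 9) == [9, 9, 9, 9, 9]);
    assert(generated!(i => i)(5) == [0, 1, 2, 3, 4]);
}
```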
But even constructing objects sanely relies on init. All
user-defined objects
are fully initialized to what their member variables are
directly initialized
to before their constructors are even called. In the case of a
struct, that's
the struct's init value. It's not for a class, because you
can't have a class
separate from its reference (so it's the reference which gets
the init value),
but the class still has a state equivalent to a struct's init
value, and
that's the state that it has before any of its constructors are
called.
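A small sketch of the state described here, with made-up type names (Config, Base, Derived). It assumes only what the paragraph states: member initializers are applied before any constructor body runs, including the members of a derived class.

```d
struct Config
{
    int retries = 3;        // part of Config.init
    string name = "none";

    this(string n)
    {
        // The body starts from Config.init, not from garbage.
        assert(retries == 3 && name == "none");
        name = n;
    }
}

class Base
{
    int level = 1;
    int seenInCtor;

    this() { seenInCtor = describe(); }   // virtual call from a constructor
    int describe() { return level; }
}

class Derived : Base
{
    int extra = 42;         // already set when Base's constructor runs
    override int describe() { return extra; }
}

void main()
{
    auto c = Config("x");
    assert(c.retries == 3 && c.name == "x");

    auto d = new Derived;
    assert(d.seenInCtor == 42);   // Base's ctor saw Derived's initialized member
}
```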
So for classes .init is null which complicates non-nullable
classes. It seems the "solution" (more like a hack IMO) of
@disable _breaks_ the .init guarantee in the language.
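To illustrate what @disable does to a struct, here is a sketch with a hypothetical NonDefault type. As far as I know, default construction is rejected while the init value itself can still be named explicitly, so treat the second static assert as my understanding of current compilers rather than a guarantee.

```d
struct NonDefault
{
    int value;
    @disable this();            // forbid `NonDefault x;`
    this(int v) { value = v; }
}

void main()
{
    // Default construction is rejected at compile time...
    static assert(!__traits(compiles, { NonDefault x; }));
    // ...but the type still has an init value that code can name explicitly.
    static assert(__traits(compiles, { auto y = NonDefault.init; }));

    auto z = NonDefault(5);
    assert(z.value == 5);
    assert(NonDefault.init.value == 0);
}
```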
If it weren't for that, you'd get the insanity that C++ or Java
have with
regards to the state of objects prior to construction. C++ is
particularly bad
in that each derived class is created in turn, meaning that
when a constructor
is called, the object _is_ that class rather than the derived
class that
you're ultimately constructing (which means that things can go
horribly wrong
if you're stupid enough to call a virtual function from a
constructor in C++).
I believe that Java handles that somewhat better, but it gets
bizarre ordering
issues with regards to initializing member variables that cause
problems if
you try and alter member variables from base classes inside of
a derived
constructor. With D, the object is guaranteed to be in a sane
state prior to
construction.
C++ is insanely bad here mainly due to [virtual?] MI, which
doesn't affect D,
and Java _allows_ virtual methods in constructors, which I think
is also "fixed" in the latest C++ standard. I don't know about
the ordering problems you mention but AFAIK the complication
arises with MI, not default initialization. It's just a matter of
properly defining the inheritance semantics.
And without init, even if every place that an object is
instantiated could be
directly initialized by the programmer (which it can't), then
you would either
end up with garbage every time that a variable isn't directly
initialized, or
you'd have to directly initialize them all. In order for D's
construction
model to work, this would include directly initializing _all_
member variables
even if the constructor then set them to something else (which
would actually
cause problems with const and immutable). And that would get
_very_ annoying,
even if it would be preferable for the local variable to
require explicit
initialization.
You talk about:
class C {
immutable T val; // what to do here?
this() { ... }
}
This can be solved by either requiring a ctor call at # or if
none specified call T(), or we can require the init to happen in
the ctor a-la C++ semantics.
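The last option (initialize in the ctor) is roughly what D already accepts for immutable members. A minimal sketch, using int in place of T:

```d
class C
{
    immutable int val;      // no initializer: starts as int.init

    this(int v)
    {
        val = v;            // one-time initialization is allowed inside a constructor
    }
}

void main()
{
    auto c = new C(7);
    assert(c.val == 7);

    // Outside a constructor the member cannot be assigned.
    static assert(!__traits(compiles, { auto x = new C(1); x.val = 2; }));
}
```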
Another case where init is required is out parameters. All out
parameters are
set to their init value when the function is called in order to
avoid bugs
caused by reading the value of an out parameter before it's set
within the
function. That wouldn't work at all without init.
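A short sketch of the out behaviour described above. The observe function is hypothetical; it just reports what the parameter held on entry.

```d
// Hypothetical function: reports what it saw before assigning.
int observe(out int result)
{
    int seen = result;   // always int.init here, whatever the caller passed in
    result = 5;
    return seen;
}

void main()
{
    int v = 42;
    assert(observe(v) == 0);   // the caller's 42 was reset to int.init on entry
    assert(v == 5);
}
```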
Personally, I'd just remove this feature from the language,
tuples are a far better design for returning multiple values and
even with this feature intact, we could always use the default
no-arg constructor.
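As a sketch of the tuple alternative, here is how a function that would otherwise need two out parameters can return both values with std.typecons. The divide function and its field names are made up for illustration.

```d
import std.typecons : Tuple, tuple;

// Hypothetical replacement for `void divide(int a, int b, out int q, out int r)`.
Tuple!(int, "quot", int, "rem") divide(int a, int b)
{
    return typeof(return)(a / b, a % b);
}

void main()
{
    auto r = divide(17, 5);
    assert(r.quot == 3 && r.rem == 2);
}
```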
E.g.
void foo(out T val);
becomes:
void foo(out T val = T());
One of the more annoying AA bugs makes it so that if the foo
function in this
code
aa[5] = foo();
throws, then aa[5] gets set with an init value of the element
type. While this
clearly shouldn't happen, imagine how much worse it would be if
we didn't have
init, and that element got set to garbage?
I don't get this example. If foo throws then the calling code
will get control. How would you ever get to read that garbage in
aa[5]? The surrounding try catch block should take care of this
explicitly anyway.
E.g.
try {
aa[5] = foo(); // foo throws
// ## do something with aa[5], this won't happen
} catch (Exception e) {
// Please handle aa[5] here explicitly.
//
}
// @@ do something with aa[5], works due to the explicit fix in
// the catch.
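One way the explicit fix in the catch could look, as a compilable sketch. The keyPresentAfterFailure wrapper is hypothetical, and the cleanup is written so that it does not matter whether the compiler inserted the key before foo threw.

```d
int foo()
{
    throw new Exception("foo failed");
}

// Returns true if key 5 is still present after the failed assignment was cleaned up.
bool keyPresentAfterFailure()
{
    int[int] aa;
    try
    {
        aa[5] = foo();   // foo throws; whether aa[5] exists now depends on the compiler
    }
    catch (Exception e)
    {
        aa.remove(5);    // explicit cleanup; a no-op if the key was never inserted
    }
    return (5 in aa) !is null;
}

void main()
{
    assert(!keyPresentAfterFailure());
}
```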
There are probably other cases that I can't think of right now
where init gets
used - probably in the runtime if nowhere else. Every place
that could
possibly result in a variable being garbage _doesn't_ result in
garbage,
because we have init.
And regardless of what the language does, there are definitely
places where the
standard library takes advantage of init. It uses it a lot for
type
inference, but it also uses it directly in places such as
std.algorithm.move.
Without init, it would end up dealing with garbage values. It's
also a
lifesaver in generic code, because without it, generic code
_can't_ initialize
variables in many cases. Take something like
T t;
if(cond)
{
...
t = getValue();
...
}
else
{
...
t = getOtherValue();
...
}
How on earth could a generic function initialize t without
T.init? void?
That's just begging for bugs when one of the paths doesn't
actually set t like
it's supposed to. It doesn't know anything about the type and
therefore
doesn't know what a reasonable default value would be, so it
can't possibly
initialize t properly.
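The generic case can be sketched as follows. Both pick and forgotten are hypothetical helpers; the second one shows what a path that forgets to assign t would leave behind under the current rules.

```d
// Hypothetical generic helper: T is unknown, yet `T t;` is always well-defined.
T pick(T)(bool cond, T a, T b)
{
    T t;            // starts as T.init
    if (cond)
        t = a;
    else
        t = b;
    return t;
}

// Returns the value a forgotten assignment would leave behind.
T forgotten(T)()
{
    T t;
    return t;
}

void main()
{
    assert(pick(true, 1, 2) == 1);
    assert(pick(false, "a", "b") == "b");
    assert(forgotten!int() == 0);
    assert(forgotten!string() is null);
    assert(forgotten!(int[])() is null);
}
```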
Doesn't @disable break those algorithms in Phobos anyway? How
would that work for non-nullable classes?
To answer the above question, I'd say there's nothing wrong with
init to void. This is what happens anyway since the .init isn't
used and the optimizer will optimize it away.
I can understand preferring that local variables have to be
directly
initialized by the programmer, but it just doesn't scale.
Having init is
_far_ more flexible and far more powerful. Any and every
situation that might
need to initialize a variable can do it. Without init, that
just isn't
possible.
- Jonathan M Davis
Again, thanks for the explanation. I have to say that on a
general level I have to agree with Don's post and I don't see how
the .init idiom generally "works" or is useful. I can't see
anything in the above examples that shows that .init is
absolutely required and we can't live without it. The only thing
that worries me here is the reliance of the runtime/Phobos on
.init.