Re: struct parameterless constructor

Jonathan M Davis via Digitalmars-d-learn Tue, 05 Aug 2025 02:29:24 -0700

On Monday, August 4, 2025 5:21:53 PM Mountain Daylight Time H. S. Teoh via 
Digitalmars-d-learn wrote:
> Fast forward 20 or so years, and things have changed a bit.  People
> started using structs for many other things, some beyond the original
> design, and inevitably ran into cases where they really needed
> parameterless default ctors for structs.  But since the language did not
> support this, a workaround was discovered: the no-op default ctor that
> the compiler generates for each struct (i.e., this()), was tagged with
> @disable, which is a mechanism to indicate that the initial by-value
> blit of the struct is *not* a valid initial state.  Then opCall was used
> so that you could construct the struct like this:
>
>   auto s = S();
>
> which resembles default construction for class objects, at least
> syntactically.  This wasn't in line with the original design of the
> language, but it worked with current language features and got people
> what they wanted without further language changes, so it was left at
> that, and became the de facto standard workaround for the language's
> lack of default struct ctors.


I know that this advice has been given plenty of times in the past, but I'd
actually strongly advise against ever doing this. It is less error-prone to
simply use an explicitly named static function - to the point that it
should actually probably be illegal to declare a static opCall which can be
called with no arguments. This is because

    S s;

and

    auto s = S();

and

    auto s = S.init;

are all subtly different, and adding a static opCall makes the situation
worse.

For a struct that's not nested (or a nested struct which is static) and
which does not disable default initialization,

    S s;

and

    auto s = S();

and

    auto s = S.init;

are all identical. In all cases, the struct will be initialized with its
init value. This is the same state that a struct has prior to any of its
constructors being called if it's initialized via a constructor.
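
As a minimal sketch of that simple case (Point and allThreeMatch are
hypothetical names used only for illustration), the three spellings can be
compared directly:

```d
import std.stdio : writeln;

// A plain module-level struct: not nested, no constructors, and default
// initialization left enabled.
struct Point
{
    int x = 3;  // explicit initializer becomes part of Point.init
    int y;      // falls back to int.init
}

// Returns true if all three spellings produce the same value.
bool allThreeMatch()
{
    Point a;                // default initialization
    auto b = Point();       // struct literal with no arguments
    auto c = Point.init;    // explicit init value

    return a == b && b == c && a.x == 3 && a.y == 0;
}

void main()
{
    writeln(allThreeMatch());
}
```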

However, if the struct disables default initialization, then

    S s;

and

    auto s = S();

will fail to compile, whereas

    auto s = S.init;

will compile just fine. This is because the first two attempt to use default
initialization, whereas the third explicitly gives the variable its init
value. Because of this, some folks advise using S() instead of S.init when
you need to explicitly default-initialize a value when you can't just
declare the variable and let it be default-initialized (e.g. when passing it
to a function). And for a struct to overload static opCall screws with that.
Of course, it can then be argued that what really needs to happen is that
you do something like

    foo(defaultInit!S())

instead of

    foo(S());

where defaultInit is something like

    T defaultInit(T)()
    {
        T retval;
        return retval;
    }

but regardless, there is code in the wild which will use S() instead of
S.init to get the default-initialized value in order to not compile when the
type cannot be default initialized, which makes static opCall error-prone
(at least if it's going to be used with anyone else's code - and especially
if it's going to be used with generic code).
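
To make that concrete, here is a rough sketch. NoDefault and Hijacked are
hypothetical types made up for illustration; defaultInit is the helper from
above. The static asserts check which spellings compile when default
initialization is disabled, and main shows how a no-argument static opCall
changes what S() means while defaultInit keeps working:

```d
// Hypothetical type that forbids default initialization.
struct NoDefault
{
    int i;
    @disable this();
    this(int i) { this.i = i; }
}

// Hypothetical type that uses a no-argument static opCall as a factory.
struct Hijacked
{
    int i = 7;

    static Hijacked opCall()
    {
        Hijacked h;     // default-initialized, so h.i == 7 here
        h.i = 42;
        return h;
    }
}

// The helper from the text: forces real default initialization.
T defaultInit(T)()
{
    T retval;
    return retval;
}

// With default initialization disabled, only the explicit init value compiles.
static assert(!__traits(compiles, { NoDefault s; }));
static assert(!__traits(compiles, { auto s = NoDefault(); }));
static assert( __traits(compiles, { auto s = NoDefault.init; }));
static assert(!__traits(compiles, defaultInit!NoDefault()));

void main()
{
    // S() now calls the factory instead of default-initializing.
    assert(Hijacked().i == 42);
    assert(Hijacked.init.i == 7);

    // The helper is unaffected by static opCall.
    assert(defaultInit!Hijacked().i == 7);
}
```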

This is one of the negative consequences of having introduced the ability to
disable default initialization. But it's not the only major problem here.
There's also the issue of non-static nested structs.

For instance, if we take the code

    void main()
    {
        import std.stdio;

        string str;

        struct S
        {
            int i;

            void foo()
            {
                str ~= "foo";
            }
        }

        S s;
        s.foo();
        writeln(str);
    }

it will run just fine and print "foo". The same would be true if you
change

    S s;

to

    auto s = S();

However, if you change it to

    auto s = S.init;

it will segfault. This is because S() is treated as a default-initialized
value, which is subtly different from the init value. The init value is what
the struct is initialized to prior to any of its constructors being run, and
if there are no constructors which are run, and the struct is not nested (or
is static), then no additional code is run to initialize the variable.
However, if it's nested and not static, then it has a context pointer which
points to its outer scope. Default initialization - and S() - will
initialize the context pointer (including prior to any constructor calls),
whereas if you explicitly initialize the variable with the init value, then
it's just going to be the init value, and the context pointer in the init
value is null - hence the segfault.
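
Generic code can at least detect this situation. The following is only a
sketch: demo and Flat are hypothetical names, and it assumes that
std.traits.isNested is an acceptable way to check for a context pointer. The
S.init case is left commented out, since running it would hit the null
context pointer described above.

```d
import std.traits : isNested;

string demo()
{
    string str;

    // Non-static nested struct: foo touches a local of the enclosing
    // function, so S carries a hidden context pointer.
    struct S
    {
        int i;

        void foo()
        {
            str ~= "foo";
        }
    }

    // A static nested struct has no context pointer.
    static struct Flat
    {
        int i;
    }

    static assert( isNested!S);
    static assert(!isNested!Flat);

    S a;            // default initialization sets the context pointer
    auto b = S();   // so does S()
    a.foo();
    b.foo();

    // auto c = S.init;  // context pointer is null
    // c.foo();          // would dereference the null context pointer

    return str;
}

void main()
{
    import std.stdio : writeln;

    writeln(demo());
}
```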

So, code dealing with nested structs needs to ensure that those structs are
default-initialized rather than simply given their init value. Of course,
originally, there shouldn't have been any difference between the two, but
the fact that the ability to give structs context pointers like that was
added to the language made it so that default initialization and the init
value aren't actually the same thing any longer.

And this is particularly annoying with any parts of the language which need
to use the init value to initialize things (e.g. arrays and out parameters),
because they're not going to work properly with types that need more
initialization than that unless they're explicitly given a value.

So, both of these language improvements made default-initialization more
complex and made it so that you have to be that much more careful with how
you initialize types. In the general case, using the init value explicitly
should not be done, because it's not actually default initialization any
longer. There are cases where it's still appropriate, but it should be used
with care.

Similarly, if it weren't for some folks using static opCall for factory
functions, then S() could be used to ensure default initialization. But
because some folks use static opCall for factory functions, using S() is
error-prone. And because some folks use S() for default initialization,
using static opCall for factory functions is also error-prone. Basically,
S() shouldn't ever be used for structs.

So, because of this mess, I would strongly advise against anyone using
static opCall without any arguments. It's begging to shoot yourself in the
foot.

Similarly, I'd advise against using S() to do default initialization,
because some folks use static opCall, and then S() won't do the right thing
any longer for code that expects it to be default initialization.

And explicitly using the init value should only be done with extreme care.
Unlike the other two, it's still appropriate at times, but it needs to be
handled carefully (particularly in generic code).

So, in effect, adding language features has taken a feature which is nice
and simple and made it rather error-prone (particularly for generic code).

And to solve the default initialization problem that some folks have tried
to solve with S(), we really probably should add an appropriate template
helper to Phobos to be used instead.

So, the TLDR is that variables should only ever be default-initialized by
simply declaring them - and that structs should never be constructed with no
arguments whether it's to default-initialize them or if it's to use a
factory function. Factory functions should just get their own names.
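
For what it's worth, a named factory might look something like the sketch
below. Settings and makeDefault are hypothetical names picked for
illustration, not anything from Phobos or from existing code:

```d
struct Settings
{
    string host;
    ushort port;

    // Named factory: the call site says what it does, and Settings()
    // keeps its built-in meaning because there is no static opCall.
    static Settings makeDefault()
    {
        Settings s;             // plain default initialization
        s.host = "localhost";
        s.port = 8080;
        return s;
    }
}

void main()
{
    Settings blank;                         // default-initialized by declaration
    auto ready = Settings.makeDefault();    // explicit, named factory

    assert(blank.host is null && blank.port == 0);
    assert(ready.host == "localhost" && ready.port == 8080);
}
```

Since nothing overloads S() here, code elsewhere that relies on Settings()
for default initialization still gets what it expects.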

- Jonathan M Davis
