Re: =void in struct definition

2018-04-11 Thread Jonathan M Davis via Digitalmars-d
On Wednesday, April 11, 2018 11:31:16 Shachar Shemesh via Digitalmars-d 
wrote:
> On 11/04/18 10:58, Jonathan M Davis wrote:
> > All objects are initialized with their init values prior to the
> > constructor being called. So, whether an object is simply
> > default-initialized or whether the constructor is called, you're going
> > to get the same behavior except for the fact that the constructor would
> > normally do further initialization beyond the init value. As such, if
> > there's a problem with the
> > default-initialized value, you're almost certainly going to get the same
> > problem when you call a constructor.
> >
> > - Jonathan M Davis
>
> That's horrible!
>
> That means that constructor initialized objects, regardless of size, get
> initialized twice.

Well, only the stuff you initialize in the constructor gets initialized
twice, but yeah, it could result in effectively initializing everything
twice if you initialize everything in the constructor. It's one of those
design choices that's geared towards correctness, since it avoids ever
dealing with the type having garbage, and the fact that you can do stuff
like

import std.stdio;

struct S
{
    int _i;

    this(int i)
    {
        foo();
        _i = 42;
    }

    void foo()
    {
        writeln(_i);
    }
}

means that if it doesn't initialize it with the init value first, then you
get undefined behavior, because _i would then be garbage when it's read
(which isn't necessarily a big deal with an int but could really matter if
it were something like a pointer). It also factors into how classes are
guaranteed to be fully initialized to the correct type _before_ any
constructors are run (avoiding the problems that you get in C++ when calling
virtual functions in constructors or destructors). Unfortunately, because
you're allowed to call arbitrary functions before initializing members, it's
also possible to violate the type system with regards to const or immutable.
e.g.

struct S
{
    immutable int _i;

    this(int i)
    {
        foo();
        _i = 42;
    }

    void foo()
    {
        writeln(_i);
    }
}

reads _i before it's fully initialized, so its state isn't identical every
time it's accessed like it's supposed to be. However, because the object is
default-initialized first, you never end up reading garbage, and the
behavior is completely deterministic even if it arguably violates the type
system. What the correct solution to that particular problem is, I don't
know (probably at least disallowing calling any member functions prior to
initializing any immutable or const members), but the fact that the object
is default-initialized first reduces the severity of the problem.
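
To make the default-initialization point concrete, here is a minimal sketch. The struct Probe and its snapshot member are hypothetical names, not something from this thread. It records what a member function sees when it runs before the constructor has assigned anything, and it relies only on the documented .init values (0 for int, NaN for double, null for pointers):

```d
import std.math : isNaN;

struct Probe
{
    int _i;
    double _d;
    int* _p;

    // What snapshot() observed before the constructor assigned anything.
    int seenI;
    bool seenNaN;
    bool seenNull;

    this(int i)
    {
        snapshot();   // runs while every member still holds its init value
        _i = i;
        _d = 1.5;
    }

    void snapshot()
    {
        seenI = _i;
        seenNaN = isNaN(_d);
        seenNull = _p is null;
    }
}

void main()
{
    auto p = Probe(42);
    assert(p.seenI == int.init);   // not garbage
    assert(p.seenNaN);             // double.init is NaN
    assert(p.seenNull);            // pointer init is null
    assert(p._i == 42);            // the constructor's assignment still wins
}
```

The reads in snapshot() are well defined only because the object was default-initialized before the constructor body ran.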

And while you can end up with portions of an object effectively being
initialized twice, for your average struct, I doubt that it matters much.
It's when you start doing stuff like having large static arrays that it
really becomes a problem. It also wouldn't surprise me if ldc optimized out
some of the double-initializations at least some of the time, but I very
much doubt that dmd's optimizer is ever that smart. Depending on the
implementation of the constructor though, I would think that it would be
possible for the compiler to determine that it doesn't actually need to
default-initialize the struct first (or that it can just default-initialize
pieces of it), because it can guarantee that a member variable isn't read
before it's initialized by the constructor. So, at least in theory, the
front end should be able to do some optimizations there. However, I have no
idea if it ever does.

I think that in theory, the idea is that we want initialization to be as
correct as possible, so there should be no garbage or undefined behavior
involved, and in the case of classes, the object should be fully the type
that it's supposed to be when its constructor is called so that you don't
get bad behavior from virtual functions, but we then have = void so that
specific variables can avoid that extra initialization cost when profiling
or whatnot shows that it's important. So, if you have something like

struct S
{
    int _a;
    int[5000] _b;

    this(int a)
    {
        _a = a;
    }
}

then it's going to behave well as far as correctness goes, and then if the
initialization is too expensive, you do

S s = void;
s._a = 42;

I think that the problem is that void initialization was intended
specifically for local variables, and the idea of = void for member
variables was not really thought through. So, you can easily do something
like

S s = void;
s._a = 42;

right now and avoid the default-initialization, but you can't cleanly do

struct S
{
    int _a;
    int[5000] _b = void;

    this(int a)
    {
        _a = a;
    }
}

So, the process is completely manual, which obviously sucks if it's
something that you _always_ want to do with the type.
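
One way to make the manual step less error-prone is to hide it behind a factory function. This is only a sketch: makeUninit is a hypothetical name, and whether the returned value is built in place rather than copied depends on the compiler's return-value optimization, which I have not checked here.

```d
struct S
{
    int _a;
    int[5000] _b;

    this(int a)
    {
        _a = a;
    }

    // Hypothetical helper: skips the S.init copy and sets only _a.
    // _b holds garbage afterwards and must be written before it is read.
    static S makeUninit(int a)
    {
        S s = void;
        s._a = a;
        return s;
    }
}

void main()
{
    auto s = S.makeUninit(42);
    assert(s._a == 42);
    // s._b is deliberately left untouched here.
}
```

Callers then write S.makeUninit(42) instead of repeating the = void dance at every use site, but the burden of never reading _b too early is still entirely on them.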

In general, D favors correctness over performance with the idea that it gives
you backdoors to get around the correctness guarantees in order to get 

Re: =void in struct definition

2018-04-11 Thread Shachar Shemesh via Digitalmars-d

On 11/04/18 10:58, Jonathan M Davis wrote:

> All objects are initialized with their init values prior to the constructor
> being called. So, whether an object is simply default-initialized or whether
> the constructor is called, you're going to get the same behavior except for
> the fact that the constructor would normally do further initialization
> beyond the init value. As such, if there's a problem with the
> default-initialized value, you're almost certainly going to get the same
> problem when you call a constructor.
>
> - Jonathan M Davis



That's horrible!

That means that constructor initialized objects, regardless of size, get 
initialized twice.


Shachar


Re: =void in struct definition

2018-04-11 Thread Jonathan M Davis via Digitalmars-d
On Wednesday, April 11, 2018 10:45:40 Shachar Shemesh via Digitalmars-d 
wrote:
> On 09/04/18 14:22, Jonathan M Davis wrote:
> > On Monday, April 09, 2018 14:06:50 Shachar Shemesh via Digitalmars-d wrote:
> >> struct S {
> >>     int a;
> >>     int[5000] arr = void;
> >> }
> >>
> >> void func() {
> >>     S s;
> >> }
> >>
> >> During the s initialization, the entire "S" area is initialized,
> >> including the member arr which we asked to be = void.
> >>
> >> Is this a bug?
> >
> > It looks like Andrei created an issue about it as an enhancement request
> > several years ago:
> >
> > https://issues.dlang.org/show_bug.cgi?id=11331
> >
> > - Jonathan M Davis
>
> Except that issue talks about default constructed objects. My problem
> happens also with objects constructed with a constructor:
>
>
> extern(C) void func(ref S s);
>
> struct S {
>  uint a;
>  int[5000] arr = void;
>
>  this(uint val) {
>  a = val;
>  }
> }
>
> void main() {
>  auto s = S(12);
>
>  // To prevent the optimizer from optimizing s away
>  func(s);
> }
>
> $ ldc2 -c -O3 -g test.d
> $ objdump -S -r test.o | ddemangle > test.s
>
>  <_Dmain>:
>  }
> }
>
> void main() {
>    0:   48 81 ec 28 4e 00 00    sub    $0x4e28,%rsp
>    7:   48 8d 7c 24 04          lea    0x4(%rsp),%rdi
>  auto s = S(12);
>    c:   31 f6                   xor    %esi,%esi
>    e:   ba 20 4e 00 00          mov    $0x4e20,%edx
>   13:   e8 00 00 00 00          callq  18 <_Dmain+0x18>
>                         14: R_X86_64_PLT32      memset-0x4
>  a = val;
>   18:   c7 04 24 0c 00 00 00    movl   $0xc,(%rsp)
>   1f:   48 89 e7                mov    %rsp,%rdi
>
>  // To prevent the optimizer from optimizing s away
>  func(s);
>   22:   e8 00 00 00 00          callq  27 <_Dmain+0x27>
>                         23: R_X86_64_PLT32      func-0x4
> }
>   27:   31 c0                   xor    %eax,%eax
>   29:   48 81 c4 28 4e 00 00    add    $0x4e28,%rsp
>   30:   c3                      retq
>
>
> Notice the call to memset.
>
> Shachar

All objects are initialized with their init values prior to the constructor
being called. So, whether an object is simply default-initialized or whether
the constructor is called, you're going to get the same behavior except for
the fact that the constructor would normally do further initialization
beyond the init value. As such, if there's a problem with the
default-initialized value, you're almost certainly going to get the same
problem when you call a constructor.

- Jonathan M Davis



Re: =void in struct definition

2018-04-11 Thread Shachar Shemesh via Digitalmars-d

On 09/04/18 14:22, Jonathan M Davis wrote:

> On Monday, April 09, 2018 14:06:50 Shachar Shemesh via Digitalmars-d wrote:
> > struct S {
> >     int a;
> >     int[5000] arr = void;
> > }
> >
> > void func() {
> >     S s;
> > }
> >
> > During the s initialization, the entire "S" area is initialized,
> > including the member arr which we asked to be = void.
> >
> > Is this a bug?
>
> It looks like Andrei created an issue about it as an enhancement request
> several years ago:
>
> https://issues.dlang.org/show_bug.cgi?id=11331
>
> - Jonathan M Davis



Except that issue talks about default constructed objects. My problem 
happens also with objects constructed with a constructor:



extern(C) void func(ref S s);

struct S {
    uint a;
    int[5000] arr = void;

    this(uint val) {
        a = val;
    }
}

void main() {
    auto s = S(12);

    // To prevent the optimizer from optimizing s away
    func(s);
}

$ ldc2 -c -O3 -g test.d
$ objdump -S -r test.o | ddemangle > test.s

 <_Dmain>:
}
}

void main() {
   0:   48 81 ec 28 4e 00 00    sub    $0x4e28,%rsp
   7:   48 8d 7c 24 04          lea    0x4(%rsp),%rdi
    auto s = S(12);
   c:   31 f6                   xor    %esi,%esi
   e:   ba 20 4e 00 00          mov    $0x4e20,%edx
  13:   e8 00 00 00 00          callq  18 <_Dmain+0x18>
                        14: R_X86_64_PLT32      memset-0x4
        a = val;
  18:   c7 04 24 0c 00 00 00    movl   $0xc,(%rsp)
  1f:   48 89 e7                mov    %rsp,%rdi

    // To prevent the optimizer from optimizing s away
    func(s);
  22:   e8 00 00 00 00          callq  27 <_Dmain+0x27>
                        23: R_X86_64_PLT32      func-0x4
}
  27:   31 c0                   xor    %eax,%eax
  29:   48 81 c4 28 4e 00 00    add    $0x4e28,%rsp
  30:   c3                      retq


Notice the call to memset.

Shachar
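
For readers checking the constants in the listing above: 0x4e20 is 20000, which is 5000 * int.sizeof, and the lea of 0x4(%rsp) points just past the 4-byte member a. So the memset covers exactly the arr member that was marked = void. The layout can be confirmed at compile time with a small sketch (only static asserts, nothing assumed beyond the struct from the message):

```d
struct S
{
    uint a;
    int[5000] arr = void;

    this(uint val)
    {
        a = val;
    }
}

static assert(S.a.offsetof == 0);
static assert(S.arr.offsetof == 4);            // matches lea 0x4(%rsp),%rdi
static assert(typeof(S.arr).sizeof == 0x4e20); // 20000 bytes, the memset length
static assert(S.sizeof == 20_004);

void main()
{
    assert(5000 * int.sizeof == 0x4e20);
}
```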


Re: =void in struct definition

2018-04-09 Thread Johan Engelen via Digitalmars-d

On Monday, 9 April 2018 at 11:06:50 UTC, Shachar Shemesh wrote:

> struct S {
>   int a;
>   int[5000] arr = void;
> }
>
> void func() {
>   S s;
> }
>
> During the s initialization, the entire "S" area is
> initialized, including the member arr which we asked to be =
> void.
>
> Is this a bug?


Could be optimized, yes, provided that the spec is updated. We 
discussed this live at the end of my DConf talk last year, and 
Walter (in audience) agreed upon the needed spec change. I 
haven't had/taken the time to work on it yet :(


The optimization of simplifying the initialization isn't too 
hard, but it is a bit tricky. Johannes wrote down some good 
points here: https://issues.dlang.org/show_bug.cgi?id=15951
(note the padding bytes issue).  The good news is that there 
doesn't appear to be any spec about it, so technically there is 
no language breakage and currently it is an "accepts invalid" 
bug...


Over dinner, deadalnix, some others, and I discussed a further 
optimization where emission of the large S.init could be 
eliminated. We worked out some details, but it's a somewhat 
harder thing to do.


cheers,
  Johan



Re: =void in struct definition

2018-04-09 Thread Stefan Koch via Digitalmars-d

On Monday, 9 April 2018 at 14:11:35 UTC, jmh530 wrote:

> On Monday, 9 April 2018 at 11:15:14 UTC, Stefan Koch wrote:
> > Not semantically, but you might consider it a performance bug.
> > This particular one could be fixed, but I cannot say how messy
> > the details are.
> > There is potential for code that silently relies on the
> > behavior and would break in very non-obvious ways if we fixed
> > it.
>
> If the fix causes non-obvious breakage, then why not a DIP for
> an opInit that overrides the default initialization and has the
> desired new functionality?
>
> Though it would be annoying to have two ways of doing the same
> thing...


It's not worth a DIP.
You can write a static initializer function and pass it a 
GCAlloced pointer.
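
One possible reading of that suggestion, as a rough sketch: the names initialize and makeS are hypothetical, and this is a guess at the intent rather than code from the thread. It uses GC.malloc from core.memory, which returns a block that is not zeroed, so nothing is copied from S.init.

```d
import core.memory : GC;

struct S
{
    uint a;
    int[5000] arr = void;

    // Hypothetical "static initializer": fills in only what must be valid.
    static void initialize(S* p, uint val)
    {
        p.a = val;
    }
}

S* makeS(uint val)
{
    // GC.malloc does not clear the block; NO_SCAN because S holds no pointers.
    auto p = cast(S*) GC.malloc(S.sizeof, GC.BlkAttr.NO_SCAN);
    S.initialize(p, val);
    return p;
}

void main()
{
    auto p = makeS(12);
    assert(p.a == 12);
    // p.arr is uninitialized and must be written before it is read.
}
```

The trade-off is that the object lives on the GC heap rather than the stack, and every construction site has to go through makeS.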


Re: =void in struct definition

2018-04-09 Thread jmh530 via Digitalmars-d

On Monday, 9 April 2018 at 11:15:14 UTC, Stefan Koch wrote:


> Not semantically, but you might consider it a performance bug.
> This particular one could be fixed, but I cannot say how messy
> the details are.
> There is potential for code that silently relies on the
> behavior and would break in very non-obvious ways if we fixed
> it.


If the fix causes non-obvious breakage, then why not a DIP for an 
opInit that overrides the default initialization and has the 
desired new functionality?


Though it would be annoying to have two ways of doing the same 
thing...


Re: =void in struct definition

2018-04-09 Thread Steven Schveighoffer via Digitalmars-d

On 4/9/18 7:06 AM, Shachar Shemesh wrote:

> struct S {
>    int a;
>    int[5000] arr = void;
> }
>
> void func() {
>    S s;
> }
>
> During the s initialization, the entire "S" area is initialized,
> including the member arr which we asked to be = void.
>
> Is this a bug?


Not technically. It has to initialize `a` to 0. The only way we 
initialize structs is to copy the whole initializer with memcpy.


It would be possible to leave the "tail" uninitialized, and just store 
the initializer for the first members that have non-void initializers. 
But that's not how it works now.


If that were to happen, you'd still have the same issue with things like:

struct S {
   int[5000] arr = void;
   int a;
}

But maybe that's just something we would have to live with.

-Steve
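
The difference between the two member orders can be put in numbers. In this sketch, TailVoid, HeadVoid and prefixBytes are hypothetical names; prefixBytes is the number of leading bytes a "copy only the initialized prefix" scheme would still have to copy in order to cover the member a.

```d
struct TailVoid
{
    int a;
    int[5000] arr = void;
}

struct HeadVoid
{
    int[5000] arr = void;
    int a;
}

// Bytes from the start of the struct up to and including member a.
enum prefixBytes(T) = T.a.offsetof + typeof(T.a).sizeof;

static assert(prefixBytes!TailVoid == 4);       // the 20000-byte tail could be skipped
static assert(prefixBytes!HeadVoid == 20_004);  // the prefix is the whole struct

void main()
{
    assert(HeadVoid.a.offsetof == 20_000);
}
```

With the void member first, a prefix-only scheme saves nothing, which is the limitation described above.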


Re: =void in struct definition

2018-04-09 Thread Stefan Koch via Digitalmars-d

On Monday, 9 April 2018 at 11:15:14 UTC, Stefan Koch wrote:

> On Monday, 9 April 2018 at 11:06:50 UTC, Shachar Shemesh wrote:
> > [ ... ]
> > During the s initialization, the entire "S" area is
> > initialized, including the member arr which we asked to be =
> > void.
> >
> > Is this a bug?
> >
> > Shachar
>
> [ ... ] {This could be fixed, but may break code} [ ... ]


So currently on initialization we do this:
---
structPtr = cast(StructType*) alloc(structSize);
memcpy(structPtr, StructType.static_struct_initializer,
       StructType.sizeof);
---

which we could change to
---
structPtr = cast(StructType*) alloc(structSize);
foreach (initializerSegment; StructType.InitializerSegments)
{
    memcpy((cast(void*) structPtr) + initializerSegment.segmentOffset,
           cast(void*) initializerSegment.segmentPtr,
           initializerSegment.segmentSize);
}
---

This will potentially remove quite a lot of binary bloat, since 
void members no longer need to be stored in initializers, and 
also some initialization overhead.
In terms of implementation this _should_ be straight-forward but 
well ... runtime and compiler interaction can be a mess.
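
A runnable approximation of the segment idea, with the table written by hand. InitSegment, initFromSegments and makeViaSegments are hypothetical names, and the table stands in for whatever the compiler would actually emit; none of this exists in the compiler or runtime as far as this thread says.

```d
import core.stdc.string : memcpy;

struct S
{
    int a = 7;
    int[5000] arr = void;
    int b = 9;
}

// Hypothetical descriptor for one initialized run of bytes inside S.
struct InitSegment
{
    size_t offset;      // where the run starts inside the struct
    const(void)* ptr;   // source bytes for the run
    size_t size;        // length of the run in bytes
}

// Copy only the listed segments; everything else is left untouched.
void initFromSegments(void* dst, const(InitSegment)[] segs)
{
    foreach (seg; segs)
        memcpy(cast(ubyte*) dst + seg.offset, seg.ptr, seg.size);
}

S makeViaSegments()
{
    static immutable int aInit = 7;
    static immutable int bInit = 9;
    // Hand-written table standing in for compiler-generated data.
    const InitSegment[2] segs = [
        InitSegment(S.a.offsetof, &aInit, int.sizeof),
        InitSegment(S.b.offsetof, &bInit, int.sizeof),
    ];

    S s = void;                    // no copy of S.init
    initFromSegments(&s, segs[]);
    return s;
}

void main()
{
    auto s = makeViaSegments();
    assert(s.a == 7 && s.b == 9);  // arr is still uninitialized
}
```

Here only 8 bytes of initializer data are stored and copied, instead of the 20008 bytes of a full S.init, and the scheme works regardless of where the void member sits in the struct.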


Re: =void in struct definition

2018-04-09 Thread Jonathan M Davis via Digitalmars-d
On Monday, April 09, 2018 14:06:50 Shachar Shemesh via Digitalmars-d wrote:
> struct S {
>   int a;
>   int[5000] arr = void;
> }
>
> void func() {
>   S s;
> }
>
> During the s initialization, the entire "S" area is initialized,
> including the member arr which we asked to be = void.
>
> Is this a bug?

It looks like Andrei created an issue about it as an enhancement request
several years ago:

https://issues.dlang.org/show_bug.cgi?id=11331

- Jonathan M Davis



Re: =void in struct definition

2018-04-09 Thread Simen Kjærås via Digitalmars-d

On Monday, 9 April 2018 at 11:06:50 UTC, Shachar Shemesh wrote:

> struct S {
>   int a;
>   int[5000] arr = void;
> }
>
> void func() {
>   S s;
> }
>
> During the s initialization, the entire "S" area is
> initialized, including the member arr which we asked to be =
> void.
>
> Is this a bug?


https://issues.dlang.org/show_bug.cgi?id=16956

--
  Simen


Re: =void in struct definition

2018-04-09 Thread Stefan Koch via Digitalmars-d

On Monday, 9 April 2018 at 11:06:50 UTC, Shachar Shemesh wrote:

> struct S {
>   int a;
>   int[5000] arr = void;
> }
>
> void func() {
>   S s;
> }
>
> During the s initialization, the entire "S" area is
> initialized, including the member arr which we asked to be =
> void.
>
> Is this a bug?
>
> Shachar


Not semantically, but you might consider it a performance bug.
This particular one could be fixed, but I cannot say how messy 
the details are.
There is potential for code that silently relies on the behavior 
and would break in very non-obvious ways if we fixed it.


=void in struct definition

2018-04-09 Thread Shachar Shemesh via Digitalmars-d

struct S {
  int a;
  int[5000] arr = void;
}

void func() {
  S s;
}

During the s initialization, the entire "S" area is initialized, 
including the member arr which we asked to be = void.


Is this a bug?

Shachar