Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-19 Thread Shachar Shemesh via Digitalmars-d

On 18/05/18 22:57, kinke wrote:
I checked, and the reason is that D and C++ use a different ABI wrt. 
by-value passing of non-POD arguments. C++ indeed passes a reference to 
a caller-allocated rvalue, not just on Win64; that makes it trivial, as 
there are no moves across call boundaries. But your proposal may imply 
changing the D ABI accordingly.


That seems to be the case.

Assuming https://dlang.org/spec/abi.html is the ABI you refer to, there 
is very little in way of argument calling there:


https://dlang.org/spec/abi.html#function_calling_conventions

"
The extern (C) and extern (D) calling convention matches the C calling 
convention used by the supported C compiler on the host system. Except 
that the extern (D) calling convention for Windows x86 is described here.

"

So the current state is, in essence, that in C everything is PoD, and 
that's why that's also the case in D. Yes, something will need to change 
there.


Shachar


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-18 Thread kinke via Digitalmars-d

On Thursday, 17 May 2018 at 19:11:27 UTC, Shachar Shemesh wrote:

On 17/05/18 18:47, kinke wrote:
Since clang is able to compile this struct and do everything 
with it, and since the existence of the move constructor 
requires the precise same type of hooking as is needed in this 
case, I tend to believe that an IR representation of DIP 1014 
is possible.


I checked, and the reason is that D and C++ use a different ABI 
wrt. by-value passing of non-POD arguments. C++ indeed passes a 
reference to a caller-allocated rvalue, not just on Win64; that 
makes it trivial, as there are no moves across call boundaries. 
But your proposal may imply changing the D ABI accordingly.


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-18 Thread Rubn via Digitalmars-d

On Thursday, 17 May 2018 at 20:25:26 UTC, Shachar Shemesh wrote:
Obviously, Something can be an enum or a boolean. If it is, 
however, then we have to perform a condition to select the 
correct value. The problem with conditionals is that if the CPU 
misses a guess about what they are (and in our case, the CPU is 
going to miss about 50% of the time), they are extremely 
expensive to evaluate.


Performance wise, a much saner approach is:
alias Something = int*;

Of course, this means our struct now has a self referencing 
pointer.


What I'm getting at is that even if there are alternatives to 
structs pointing at themselves, they may not be performance 
wise comparable to pointers.



It's possible to do a branchless condition that chooses between 
two pointers. I think if the hardware (and compiler) support it 
it'll just optimize down to a "cmov".





Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-18 Thread Kagamin via Digitalmars-d

On Thursday, 17 May 2018 at 19:13:48 UTC, Shachar Shemesh wrote:
The only inherent non @safe thing we advocate here is if you 
want to be able to move const/immutable structs, in which case 
DIP 1014 advocates casting the constness away. That will, of 
course, have to be either @system or @trusted.


There's an idea to give const postblit ability to write, but it's 
difficult to make such exception for operator method, a possible 
syntax can be `this(this, const ref S prev)`


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread Manu via Digitalmars-d
On 17 May 2018 at 13:25, Shachar Shemesh via Digitalmars-d
 wrote:
> On 17/05/18 22:29, Manu wrote:
>>
>>
>> This is great!
>> I've wanted this on numerous occasions when interacting with C++ code.
>> This will make interaction more complete.
>>
>> Within self-contained D code, I have avoided self-pointers by using
>> self-offsets instead in the past (a bit hack-ey). But this nicely
>> tidies up one more little rough-edge in the language.
>>
>> :+1:
>>
>
> Following Andrei's advice, I've actually started writing a couple of
> examples to illustrate why this is needed.
>
> The first was this:
>
> struct S {
>   static int global;
>   int local;
>
>   Something selector; // Decides whether we want local or global.
> }
>
> Let's further assume that we have an array of S instances with random
> uniform distribution between local and global.
>
> Obviously, Something can be an enum or a boolean. If it is, however, then we
> have to perform a condition to select the correct value. The problem with
> conditionals is that if the CPU misses a guess about what they are (and in
> our case, the CPU is going to miss about 50% of the time), they are
> extremely expensive to evaluate.
>
> Performance wise, a much saner approach is:
> alias Something = int*;
>
> Of course, this means our struct now has a self referencing pointer.
>
> What I'm getting at is that even if there are alternatives to structs
> pointing at themselves, they may not be performance wise comparable to
> pointers.
>
> Of course, the second example was a struct that has no internal pointers,
> but rather maintains global pointers pointing to it. This problem is quite a
> bit harder to solve.
>
> Shachar

Oh yeah, I totally recognise that there are instances where a local
offset is not a solution, I was just saying how I've managed to
avoided dealing with this issue before, and not that I loved doing so
that way ;)
This would be a very welcome fix, particularly in code where I
interact with C++!


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread Shachar Shemesh via Digitalmars-d

On 17/05/18 22:29, Manu wrote:


This is great!
I've wanted this on numerous occasions when interacting with C++ code.
This will make interaction more complete.

Within self-contained D code, I have avoided self-pointers by using
self-offsets instead in the past (a bit hack-ey). But this nicely
tidies up one more little rough-edge in the language.

:+1:



Following Andrei's advice, I've actually started writing a couple of 
examples to illustrate why this is needed.


The first was this:

struct S {
  static int global;
  int local;

  Something selector; // Decides whether we want local or global.
}

Let's further assume that we have an array of S instances with random 
uniform distribution between local and global.


Obviously, Something can be an enum or a boolean. If it is, however, 
then we have to perform a condition to select the correct value. The 
problem with conditionals is that if the CPU misses a guess about what 
they are (and in our case, the CPU is going to miss about 50% of the 
time), they are extremely expensive to evaluate.


Performance wise, a much saner approach is:
alias Something = int*;

Of course, this means our struct now has a self referencing pointer.

What I'm getting at is that even if there are alternatives to structs 
pointing at themselves, they may not be performance wise comparable to 
pointers.


Of course, the second example was a struct that has no internal 
pointers, but rather maintains global pointers pointing to it. This 
problem is quite a bit harder to solve.


Shachar


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread Manu via Digitalmars-d
On 17 May 2018 at 01:12, Mike Parker via Digitalmars-d
 wrote:
> This is the review thread for the first Community Review round for DIP 1014,
> "Hooking D's struct move semantics".
>
> All review-related feedback on and discussion of the DIP should occur in
> this thread. The review period will end at 11:59 PM ET on May 31, or when I
> make a post declaring it complete.
>
> At the end of Round 1, if further review is deemed necessary, the DIP will
> be scheduled for another round. Otherwise, it will be queued for the formal
> review and evaluation by the language maintainers.
>
> Please familiarize yourself with the documentation for the Community Review
> before participating.
>
> https://github.com/dlang/DIPs/blob/master/PROCEDURE.md#community-review
>
> Thanks in advance to all who participate.
>
> Destroy!

This is great!
I've wanted this on numerous occasions when interacting with C++ code.
This will make interaction more complete.

Within self-contained D code, I have avoided self-pointers by using
self-offsets instead in the past (a bit hack-ey). But this nicely
tidies up one more little rough-edge in the language.

:+1:


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread Shachar Shemesh via Digitalmars-d

On 17/05/18 19:08, Kagamin wrote:

On Thursday, 17 May 2018 at 13:50:26 UTC, Shachar Shemesh wrote:
There is no such use case. Please remember that at the time opPostMove 
is called, both new and old memory are still allocated.


That's an interesting moment too. A struct that was supposed to be moved 
is copied instead and exists in two places simultaneously. Can't tell it 
yet, but it can have a hole in type system and opPostMove can only be 
trusted, not safe.


It is a hole (of sorts) in the type system, in that "old" is not going 
to have a destructor run on its code.


With that said, just because the code is not safe, does not mean it is 
not @safe.


The only inherent non @safe thing we advocate here is if you want to be 
able to move const/immutable structs, in which case DIP 1014 advocates 
casting the constness away. That will, of course, have to be either 
@system or @trusted.


Shachar


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread Shachar Shemesh via Digitalmars-d

On 17/05/18 18:47, kinke wrote:

On Thursday, 17 May 2018 at 15:23:50 UTC, kinke wrote:

See IR for https://run.dlang.io/is/1JIsk7.


I should probably emphasize that the LLVM `byval` attribute is strange 
at first sight. Pseudo-IR `void foo(S* byval param); ... foo(S* byarg 
arg);` doesn't mean that the IR callee gets the S* pointer from the IR 
callsite; it means 'memcpy(param, arg, S.sizeof)', with `param` being an 
*implicit* address in foo's parameters stack (calculated by LLVM and so 
exposed to the callee only). That's the difficulty for LDC I mentioned 
earlier.


I understand there might be difficulty, but I strongly protest the idea 
that it is not possible, for one very simple reason: C++.


class Movable {
  int member;

public:
  Movable();

  Movable( const Movable &rhs ); // Copy constructor
  Movable( Movable &&rhs ); // Move constructor
}

Since clang is able to compile this struct and do everything with it, 
and since the existence of the move constructor requires the precise 
same type of hooking as is needed in this case, I tend to believe that 
an IR representation of DIP 1014 is possible.


Shachar


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread Kagamin via Digitalmars-d

On Thursday, 17 May 2018 at 13:50:26 UTC, Shachar Shemesh wrote:
There is no such use case. Please remember that at the time 
opPostMove is called, both new and old memory are still 
allocated.


That's an interesting moment too. A struct that was supposed to 
be moved is copied instead and exists in two places 
simultaneously. Can't tell it yet, but it can have a hole in type 
system and opPostMove can only be trusted, not safe.


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread kinke via Digitalmars-d

On Thursday, 17 May 2018 at 15:23:50 UTC, kinke wrote:

See IR for https://run.dlang.io/is/1JIsk7.


I should probably emphasize that the LLVM `byval` attribute is 
strange at first sight. Pseudo-IR `void foo(S* byval param); ... 
foo(S* byarg arg);` doesn't mean that the IR callee gets the S* 
pointer from the IR callsite; it means 'memcpy(param, arg, 
S.sizeof)', with `param` being an *implicit* address in foo's 
parameters stack (calculated by LLVM and so exposed to the callee 
only). That's the difficulty for LDC I mentioned earlier.


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread kinke via Digitalmars-d

On Thursday, 17 May 2018 at 12:36:29 UTC, Shachar Shemesh wrote:
Again, as far as I know, structs are not copied when passed as 
arguments. They are allocated on the caller's stack and a 
reference is passed to the callee. If that's the case, no move 
(of any kind) is done.


That's the exception to the rule (LDC's `ExplicitByvalRewrite`), 
and true for structs > 64 bit on Win64 (and some more structs) 
and something similar for AArch64. No other ABIs supported by LDC 
pass a low-level pointer to a caller-allocated copy for 
high-level pass-argument-by-value semantics; the argument is 
normally moved to the function parameter (in the callEE 
parameters stack).


```
struct S
{
size_t a, b;
this(this) {} // no POD anymore
}

void foo(S param);

void bar()
{
// allocate a temporary on the caller's stack and move it to 
the callee

foo(S(1, 2));

S lvalue;
// copy lvalue to a temporary on the caller's stack (incl. 
postblit call)

// and then move that temporary to the callee
foo(lvalue);

import std.algorithm.mutation : move;
// move move()-rvalue-result to the callee
foo(move(lvalue));
}
```

'Move to callee' for most ABIs meaning a bitcopy/blit to the 
callee's memory parameters stack, for LDC via LLVM `byval` 
attribute.


See IR for https://run.dlang.io/is/1JIsk7.


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread Shachar Shemesh via Digitalmars-d

On 17/05/18 16:42, Kagamin wrote:
Looks like requirement for @nogc @safe has no consequence as the DIP 
suggests to infer them anyway. On ideological side safety can't be a 
requirement because it would contradict its purpose of providing 
guarantee.


I think you are confusing __move_post_blt's implementation (by druntime) 
with opPostMove's implementation by the user.


For the former, these attributes are deducted by the compiler. For the 
later, the user may choose to include them for all of the usual reasons 
for including them, not least of which is that if she does not include 
@nogc, then trying to move a struct's instance from @nogc code will 
cause compilation failure.


> Especially if the suggested use case is handling of dangling
> pointers.

There is no such use case. Please remember that at the time opPostMove 
is called, both new and old memory are still allocated.


Shachar


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread Kagamin via Digitalmars-d
Looks like requirement for @nogc @safe has no consequence as the 
DIP suggests to infer them anyway. On ideological side safety 
can't be a requirement because it would contradict its purpose of 
providing guarantee. Especially if the suggested use case is 
handling of dangling pointers.


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread Shachar Shemesh via Digitalmars-d

I'm not sure I follow all of your comments.

For the rest my comments, let's assume that the compiler may assume that 
__move_post_blt is a no-op if hasElaborateMove returns false.


On 17/05/18 14:33, kinke wrote:
3. When deciding to move a struct instance, the compiler MUST emit a 
call to the struct's __move_post_blt after blitting the instance and 
before releasing the memory containing the old instance. 
__move_post_blt MUST receive references to both the pre- and post-move 
instances.


This implies that such structs must not be considered PODs, i.e., cannot 
be passed in registers and must be passed on the stack.


I'm not familiar with passing structs in registers. I am familiar with 
passing pointer to the structs in registers, which should not be 
affected by this.


If actual (I'm assuming short) structs can be passed in registers, then, 
yes, structs with elaborate move are not PoDs.


It also means 
that the compiler will have to insert a __move_post_blt call right 
before the call (as the callee has no idea about the old address),


Again, as far as I know, structs are not copied when passed as 
arguments. They are allocated on the caller's stack and a reference is 
passed to the callee. If that's the case, no move (of any kind) is done.


I might be missing something. Can you write a code snippet to which you 
are referring?


after 
blitting the arg to the callee params stack; this may be tricky to 
implement for LDC, as that last blit is implicit in LLVM IR (LLVM byval 
attribute).


And yet, C++ clang managed to do it in classes with && constructors.


As a side note, when passing a postblit-struct lvalue arg by value,


Just to be clear, are we talking about passing by reference an instance 
of a struct that has postblit?


the 
compiler first copies the lvalue



to a temporary on the caller's stack, 
incl. postblit call, and then moves that copy to the callee. So this 
requires either a postblit+postmove combo on the caller side before the 
actual call, or a single postblit call for the final address (callee's 
param).


Again, that does not align with my familiarity of the ABI. If I do:

struct S {
  ...

  this(this) {
// some code
  }

  void opPostMove(ref S new, const ref S old) {
// some code
  }
}

void func1(ref S arg) {
}

void func2(S arg) {
}


As far as I know the ABI, in func1, a pointer to S is passed. In func2, 
a pointer to caller stack allocate instance is passed, and the original 
is copied in. It sounds to me like you are talking about the case of:


S s;

func2(s);

in which case you need to copy s to a temporary, and then pass a pointer 
to that temporary to func2. I don't see where a move enters here.


However, if a move does enter here (and one is necessary if, for 
example, func2's frame needs to be dynamically allocated because an 
escaping delegate references it), then, yes, an opPostMove will also 
need to be called.


Again, if hasElaborateMove returns false, then no change from current 
behavior is needed.


Shachar


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread kinke via Digitalmars-d
3. When deciding to move a struct instance, the compiler MUST 
emit a call to the struct's __move_post_blt after blitting the 
instance and before releasing the memory containing the old 
instance. __move_post_blt MUST receive references to both the 
pre- and post-move instances.


This implies that such structs must not be considered PODs, i.e., 
cannot be passed in registers and must be passed on the stack. It 
also means that the compiler will have to insert a 
__move_post_blt call right before the call (as the callee has no 
idea about the old address), after blitting the arg to the callee 
params stack; this may be tricky to implement for LDC, as that 
last blit is implicit in LLVM IR (LLVM byval attribute).


As a side note, when passing a postblit-struct lvalue arg by 
value, the compiler first copies the lvalue to a temporary on the 
caller's stack, incl. postblit call, and then moves that copy to 
the callee. So this requires either a postblit+postmove combo on 
the caller side before the actual call, or a single postblit call 
for the final address (callee's param).


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread Shachar Shemesh via Digitalmars-d

On 17/05/18 11:22, rikki cattermole wrote:


What is the benefit of opPostMove over copy constructors (not postblit)?


The two are unrelated. A copy is a very different operation from a move. 
With a copy, you have to figure out how to duplicate the resources used 
by the object. With a move, no such duplication is needed.


People are somewhat conditioned to treat a move as a "copy+destroy". One 
certainly may implement it that way. There is a lot of power in not 
having to do it, though. Think a struct with "@disable this(this)".


The D compiler does moves of structs, whether they are copyable or not. 
This DIP is about being able to hook this move and fix external and 
internal references.


Shachar


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread rikki cattermole via Digitalmars-d

On 17/05/2018 8:12 PM, Mike Parker wrote:
This is the review thread for the first Community Review round for DIP 
1014, "Hooking D's struct move semantics".


All review-related feedback on and discussion of the DIP should occur in 
this thread. The review period will end at 11:59 PM ET on May 31, or 
when I make a post declaring it complete.


At the end of Round 1, if further review is deemed necessary, the DIP 
will be scheduled for another round. Otherwise, it will be queued for 
the formal review and evaluation by the language maintainers.


Please familiarize yourself with the documentation for the Community 
Review before participating.


https://github.com/dlang/DIPs/blob/master/PROCEDURE.md#community-review

Thanks in advance to all who participate.

Destroy!


What is the benefit of opPostMove over copy constructors (not postblit)?


DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread Mike Parker via Digitalmars-d
This is the review thread for the first Community Review round 
for DIP 1014, "Hooking D's struct move semantics".


All review-related feedback on and discussion of the DIP should 
occur in this thread. The review period will end at 11:59 PM ET 
on May 31, or when I make a post declaring it complete.


At the end of Round 1, if further review is deemed necessary, the 
DIP will be scheduled for another round. Otherwise, it will be 
queued for the formal review and evaluation by the language 
maintainers.


Please familiarize yourself with the documentation for the 
Community Review before participating.


https://github.com/dlang/DIPs/blob/master/PROCEDURE.md#community-review

Thanks in advance to all who participate.

Destroy!


Re: DIP 1014:Hooking D's struct move semantics--Community Review Round 1

2018-05-17 Thread Mike Parker via Digitalmars-d

On Thursday, 17 May 2018 at 08:12:50 UTC, Mike Parker wrote:
This is the review thread for the first Community Review round 
for DIP 1014, "Hooking D's struct move semantics".


And the link to the DIP:

https://github.com/dlang/DIPs/blob/38cec74a7471735559e3b8a7553f55102d289d28/DIPs/DIP1014.md