Re: Safe reference counting cannot be implemented as a library

2015-10-29 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 27 October 2015 at 18:10:18 UTC, deadalnix wrote:
I've made the claim that we should implement reference counting
as a library many times, so I think I should make my position
explicit. Indeed, RC requires some level of compiler support to
be safe. That being said, the support does not need to be
specific to RC. In fact, my position is that the language
should provide some basic mechanism on top of which safe RC can
be implemented, as a library.


The problem at hand here is escape analysis. The compiler must
be able to ensure that a reference doesn't escape the RC
mechanism in an uncontrolled manner. I'd like to add such a
mechanism to the language rather than bake in reference
counting, as it can be used to solve other problems we are
facing today (@nogc exceptions, for instance).


Here's a link to the reference safety system I proposed some 
months ago:


http://forum.dlang.org/post/offurllmuxjewizxe...@forum.dlang.org

I'm very far from having the expertise needed to know whether it 
would be worth its weight in practice, but it was better to write 
it out than to keep it bottled up in my head. I hope it will be 
of some use.


Re: Mitigating the attribute proliferation - attribute inference for functions

2015-04-12 Thread Zach the Mystic via Digitalmars-d

Please scan this thread for any useful ideas:

http://forum.dlang.org/post/vlzwhhymkjgckgyox...@forum.dlang.org

I don't have the technical expertise to know if it's useful or 
could work. The basic suggestion is that D has a function 
attribute which expressly indicates that a function is separately 
compiled, thus eliminating all ambiguity and mystery about what 
can and can't be inferred.


Re: const as default for variables

2015-03-18 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 18 March 2015 at 09:28:35 UTC, deadalnix wrote:
On Wednesday, 18 March 2015 at 06:24:38 UTC, Zach the Mystic 
wrote:
I'm starting to think that refcounting is precisely the 
opposite of ownership, useful only for when it's *impossible*
to track ownership easily. Otherwise why would you need a 
refcount?




It is not the language's problem. If the language defines
ownership, then you can define all kinds of RC systems as library
types by deferring the ownership of things to the RC library.


The good thing about it is that it doesn't limit the library 
solution to be ref counting, but it can be anything else, or 
any refcounting strategy.


Indeed, internally, the RC system has to play unsafe, but as
long as it has to free, it has to play unsafe anyway. The
important point is that it can provide a safe interface to the
outside world.


The inc/dec elision problem is simply a copy optimization
problem. Framing it as a refcounting problem is the wrong way
to think about it. You would like to elide copies as much as
possible.


The first element for this is borrowing. You can pass borrowed 
things around without needing to have copies.


So let's go through the steps. Question: What parts of borrowing 
are internally kept track of by the compiler, and what parts are 
made manifest in code? For what is made manifest, how do they 
appear -- as type qualifiers, i.e. `borrowed`, `owned` -- or 
built-in properties, e.g. `fun(x.borrowed)`? For things kept 
hidden, we need to find potential sources of ambiguity, and 
derive reliable algorithms to resolve them.


For me, a big issue is passing variables as arguments, because 
the compiler can't read into the function to see what it does, 
and the function can only tell the caller what the attribute 
system allows. What if the caller takes a wrapped type and you 
only have an unwrapped version, or a different wrapped version, 
to pass to it? Should there be any way to pass it transparently 
(i.e for the called type to automatically receive the passed type 
of the argument), or does it have to be created manually? (I was 
thinking about this when Andrei was trying to create smart 
pointers, and wondered what it would take to create a `Ref!` type
to entirely replace the `ref` storage class.) This may or may not 
be related to a fully effective ownership system.


The general problem of assignment comes up when something
is borrowed several times and assigned. const is obviously a
situation where we can elide when borrowing, but that is not
the only one.


In that situation, only borrowing the RC wrapper requires a copy
(borrowing the wrapped value does not). Note that borrowing the
wrapped value is most likely what you want in the first place in
most situations (so the code manipulating the borrowed value does
not need to rely on a specific memory management scheme, which
allows for versatile libraries), so copy elision is what you'll
get in most situations as well.


I guess this will most often be accomplished with `alias this` 
when passing to an argument? I guess what you're suggesting is 
that if a function may delete a reference, you can detect this 
because it accepts only a fully RC'd type rather than the 
unwrapped version.


Solid core constructs are much better than attribute
proliferation.


I totally agree, but at this point, we must figure out precisely 
which constructs to ask for, and then convince everyone else of 
their worth.


How did you become convinced of the value of built-in ownership? 
Is there a good article you could point me to? Secondly, what do 
you suggest it would look like in D? Type qualifiers, a storage 
class, function/parameter attributes? How much just takes place 
invisibly to the programmer?


Re: The next iteration of scope

2015-03-18 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 18 March 2015 at 13:01:50 UTC, Oren Tirosh wrote:

On Sunday, 15 March 2015 at 19:11:36 UTC, Marc Schütz wrote:

On Sunday, 15 March 2015 at 17:31:17 UTC, Nick Treleaven wrote:
On 15/03/2015 14:10, Marc Schütz schue...@gmx.net wrote:

Here's the new version of my scope proposal:
http://wiki.dlang.org/User:Schuetzm/scope2

It's still missing real-life examples, a section on the
implementation, and a more formal specification, as well as a
discussion of backwards compatibility. But I thought I'd show
what I have, so that it can be discussed early on.

I hope it will be more digestible for Walter & Andrei. It's
more or less an extended version of DIP25, and avoids the need
for most explicit annotations.


I too want a scope attribute e.g. for safe slicing of static 
arrays, etc. I'm not sure if it's too late for scope by 
default though, perhaps.


If we get @safe by default, we automatically get scope by 
default, too.


The scope storage class is a two way contract. The function 
promises not to escape the reference. The caller promises to 
ensure the storage that the reference is pointing to will 
remain valid for the duration of the function call. In some 
cases, the caller code may need to take active steps to ensure 
that, like keeping an otherwise temporary reference alive to 
prevent it from being deallocated.


But what if the pointer is null? Can this be considered to 
fulfill the caller's part of the deal?


Yes, the old @notnull debate again. For me, @safe by default 
and scope by default also suggests @notnull by default for 
scope references. Sorry if this opens up directions you don't 
want to think about at the moment...


So far, null pointers haven't been a big part of the discussion. 
By the existing definition, a null pointer is memory safe, 
because it doesn't point to anything. But they are obviously a 
problem in their own right.


Re: Phobos Documentation - call to action

2015-03-18 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 18 March 2015 at 03:45:07 UTC, Walter Bright wrote:

The bad news: the Phobos documentation sux.

The good news: we can make things a lot better by just filling 
in blanks. For example, picking a function largely at random:


  http://dlang.org/phobos/std_uni.html#sicmp

There is no Params section, no Returns: section, and no 
See_Also section. Hence, I wrote a PR for it:


  https://github.com/D-Programming-Language/phobos/pull/3060

There's nothing clever about it, just filling in the blanks. If 
we all pitch in, we can substantially improve the documentation.


Some guidelines:

1. The sections Params, Returns, and See_Also need to be there. 
(Unless there are no parameters, or a void return.)


2. One PR per function being fixed.

3. Resist the urge to do more, stay focused simply on filling 
in the blanks, one PR per function, making things easy to 
review.
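
As an illustration of guideline 1, here is a minimal sketch of a Ddoc comment with the three sections filled in. The function `countSpaces` is hypothetical and is not part of Phobos; only the section names (Params, Returns, See_Also) come from the guidelines above.

```d
/**
 * Counts the space characters in a string.
 *
 * Params:
 *     s = the string to scan
 *
 * Returns:
 *     The number of ' ' characters found in s.
 *
 * See_Also:
 *     std.algorithm.searching.count
 */
size_t countSpaces(const(char)[] s)
{
    size_t n;
    foreach (char c; s)   // walk the string one code unit at a time
        if (c == ' ')
            ++n;
    return n;
}

void main()
{
    assert(countSpaces("a b c") == 2);
    assert(countSpaces("") == 0);
}
```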


No responses yet -- not that I'm any less guilty than anyone 
else. But maybe this needs to be bumped up to a higher priority 
-- a hiatus on internal development for a couple weeks solely to 
bring documentation up to a minimum. Obviously clear guidelines 
like the ones you just posted are a plus.


Re: Phobos Documentation - call to action

2015-03-18 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 18 March 2015 at 20:19:10 UTC, Walter Bright wrote:

On 3/18/2015 12:42 PM, Zach the Mystic wrote:

But why, therefore, is it so hard to get movement on it?


I don't know why, so I'll ask. Why haven't you submitted a PR 
to fix one of them? :-)


I have pathetically little experience with most of phobos. I most 
certainly hold the record for amount of passion associated with 
the D language versus number of lines actually coded in it. That 
said, it can't be that hard to figure out what the parameters are 
and what they return. If you give me a specific module, I'll 
start making pull requests for it.


Re: Phobos Documentation - call to action

2015-03-18 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 18 March 2015 at 18:09:07 UTC, Walter Bright wrote:

On 3/18/2015 10:55 AM, Zach the Mystic wrote:
No responses yet -- not that I'm any less guilty than anyone
else. But maybe this needs to be bumped up to a higher priority
-- a hiatus on internal development for a couple weeks solely to
bring documentation up to a minimum. Obviously clear guidelines
like the ones you just posted are a plus.


We have a great language, but represent it poorly in the 
documentation. Every library entry also needs a pithy example 
(or even any example at all), but I thought we could make 
progress first by simply documenting what the return value is 
supposed to be.


We also need to stop pulling new library additions that have 
obviously inadequate documentation.


I'm just thinking in terms of psychology. I haven't seen anyone 
disagree that the documentation is inadequate, so that's not even 
disputed.


But why, therefore, is it so hard to get movement on it? I 
suspect that it's because it is perceived as a chore, like 
cleaning a barn. I don't want to go in that barn by myself. But 
if everyone's doing it, with the mutual understanding that it
needs to get done - and no one is exempt - then it doesn't feel 
so bad.


At some point, it must be possible for documentation to get so 
bad that *nothing* is more important. Otherwise, it may well 
continue to flounder in destitute obscurity, never receiving the 
attention it deserves.


Re: Phobos Documentation - call to action

2015-03-18 Thread Zach the Mystic via Digitalmars-d
On Wednesday, 18 March 2015 at 19:50:24 UTC, Andrei Alexandrescu 
wrote:

On 3/18/15 12:42 PM, Zach the Mystic wrote:
At some point, it must be possible for documentation to get so
bad that *nothing* is more important.


Strategically we're definitely there, and have been for a while 
(if we define improving D's rate of adoption as important).


Yeah, it appears so.


A body of examples of idiomatic uses of the language is missing.


Unfortunately, I get the sense that's not the only thing that's 
missing.


Full disclosure: I'm not an experienced team leader, so I can't 
promise my suggestion will work.


That said, I suggest, for the purpose of turning motive into 
action, a ten-day Documentation Holiday, akin to Franklin D. 
Roosevelt's Bank Holiday of 1933:


http://en.wikipedia.org/wiki/Emergency_Banking_Act

Guidelines for enhancements must be drawn up and made clear in 
advance, and the community given sufficient notice to prepare for 
the holiday.


It's just an idea... but as you say, D is already there, 
strategically. I didn't feel great about having to be the first 
to respond to this thread, since I'm not a major contributor 
(yet, anyway) - it looks like the sign of a real problem.


Re: const as default for variables

2015-03-18 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 17 March 2015 at 22:53:20 UTC, deadalnix wrote:
On Tuesday, 17 March 2015 at 22:25:30 UTC, Zach the Mystic 
wrote:
The real devil against safe reference counting is in the 
assignment operators, when they do destructive moves. I think 
those have to be the focus of any effort here.


I'm trying to imagine a parameter attribute `@destroy`, for 
example, indicating that its reference may get destroyed. Not 
sure if it will work, or even help, but it's a start.


That is the wrong approach. This is a known problem and there is
a known solution: ownership. If we are going to add something
in the language to handle it, then it has to be ownership.


Just so we're clear, there are two problems. One is making 
ref-counting safe. The other is making it fast, by eliding 
unnecessary operations. The issue I'm worried about is when you 
pass an RC'd type as an argument by value, for example, you make 
a copy. To be safe the compiler should wrap the original in an 
inc/dec cycle for the duration of the call. But this is a waste 
if there's no risk of reassignment, if you're just mutating the 
original data, or if the type isn't even an RC'd type but has 
some other kind of destructor. My guess was that `@destroy` could 
help the compiler elide unnecessary cycles this way.


If you always pass by reference (e.g. `ref`), you're sending the 
original, rather than copying it. This needs no wrapping 
therefore, since any reassignment will affect the original. What 
good would ownership do in that case? Any normal copying will 
increase the refcount anyway.
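
To make the by-value versus `ref` difference concrete, here is a minimal sketch of a hand-rolled counted handle. The names `RC`, `make`, `byValue` and `byRef` are hypothetical, the counter lives on the GC heap purely for simplicity, and nothing here is the proposed `@destroy` attribute; it only shows where the inc/dec pairs come from today.

```d
// Hypothetical counted handle, for illustration only.
struct RC
{
    private int* count;   // shared counter (GC-allocated to keep the sketch short)

    static RC make()
    {
        RC r;
        r.count = new int;
        *r.count = 1;
        return r;          // returned by move, so no extra increment
    }

    this(this)             // postblit: runs on every copy
    {
        if (count) ++*count;
    }

    ~this()                // runs when a copy goes out of scope
    {
        if (count) --*count;
    }

    int refs() const { return count ? *count : 0; }
}

int byValue(RC r)   { return r.refs; }  // parameter is a copy: one inc/dec pair
int byRef(ref RC r) { return r.refs; }  // no copy, so no inc/dec at all

void main()
{
    auto a = RC.make();
    assert(a.refs == 1);
    assert(byValue(a) == 2);  // the copy was counted during the call
    assert(byRef(a) == 1);    // the original was passed directly
    assert(a.refs == 1);      // the by-value copy has been released again
}
```

The by-value call is the cycle that an elision scheme would want to skip when the callee cannot reassign or release the original.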


I'm starting to think that refcounting is precisely the opposite
of ownership, useful only for when it's *impossible* to track
ownership easily. Otherwise why would you need a refcount?


What would be really interesting is a combined system, where the 
compiler detects the ownership properties of any given variable, 
and automatically decides whether it needs to be refcounted or 
not. There could be a built-in template in the runtime, e.g. a 
`_refCounted(T)`, which must be a perfect drop-in replacement for 
a regular `T` in all cases -- difficult, yes, but interesting to 
imagine at least -- which the compiler would swap in at its 
discretion. Obviously a huge flight of fancy, given that D is not
in the habit of altering the basic type of a variable based on
how it is used... but it would be very efficient if it worked.


Do you agree that refcounting and ownership oppose each other, 
that refcounting only makes sense when ownership is impossible? 
That refcounting is a runtime mechanism for tracking precisely 
what a compile time ownership system can't? In other words, what 
problems does ownership solve, and how?


Re: `return const` parameters make `inout` obsolete

2015-03-17 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 17 March 2015 at 18:27:01 UTC, ixid wrote:
To be fully viable, `return` would have to be secretly
recorded as part of `x`'s type, so that the compiler could
forgive returning it to a non-const. But the compiler should
probably track that `x` is copied from `t` anyway, so that it
can verify `return t` when it returns `x`, and the same
information would be used to forgive `x`'s constness.


But yeah, there might still be a use for `inout`.


Why is this ability important? It feels like trying to distort 
non-templates into templates. Is this the alternative to using 
templates or repeating yourself or are there other important 
aspects to it?


I don't know for sure. I think the main point of `inout` is to 
avoid returning a copy of a reference that's mutable and 
assigning it to an immutable. When there is no copy of a 
reference (i.e. it's unique), or if you know that all possible 
copies are immutable, there's no problem. Even an immutable 
reference with lifetime shorter than the `const` value it copies 
is okay.


In other words, it seems like there are a lot of cases where you 
can assign something that returns a regular `T*` to an 
`immutable(T*)` safely.


Re: const as default for variables

2015-03-17 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 17 March 2015 at 19:53:14 UTC, deadalnix wrote:

On Tuesday, 17 March 2015 at 13:55:36 UTC, Dejan Lekic wrote:
I definitely think this is a good idea. And if someone wants 
mutable variable, we simply use proposed 'var' storage class.


Brilliant!


This is going to break pretty much all the code that uses auto.

The benefit for the compiler is hypothetical. Walter is right
when mentioning that it can remove some refcount boilerplate,
but we have no idea how much, or how good the compiler would
be at recognizing it. With existing language features, the
compiler CANNOT leverage this change for optimization.


The real devil against safe reference counting is in the 
assignment operators, when they do destructive moves. I think 
those have to be the focus of any effort here.


I'm trying to imagine a parameter attribute `@destroy`, for 
example, indicating that its reference may get destroyed. Not 
sure if it will work, or even help, but it's a start.


Re: `return const` parameters make `inout` obsolete

2015-03-17 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 17 March 2015 at 12:02:15 UTC, Nick Treleaven wrote:

On 16/03/2015 14:17, Zach the Mystic wrote:

char* fun(return const char* x);

Compiler has enough information to adjust the return type of
`fun()` to that of input `x`. This assumes return parameters have
been generalized to all reference types. Destroy.


inout can be used for local variables too.


Yeah that might be a use for it.

inout(T*) fun(inout(T*) t) {
  inout(T*) x = t;
  return x;
}
--
T* gun(return const T* t) {
  const(T*) x = t;
  return x;
}

To be fully viable, `return` would have to be secretly recorded
as part of `x`'s type, so that the compiler could forgive
returning it to a non-const. But the compiler should probably
track that `x` is copied from `t` anyway, so that it can verify
`return t` when it returns `x`, and the same information would be
used to forgive `x`'s constness.


But yeah, there might still be a use for `inout`.
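
For reference, the `inout` half of the comparison above can be written so that it compiles today. This is only a sketch: `first` is a hypothetical name and `int` stands in for `T`. The `return const` form has no runnable equivalent, since it is the proposal under discussion.

```d
// `first` is a hypothetical helper; int stands in for T.
inout(int)* first(inout(int)* t)
{
    inout(int)* x = t;   // inout is allowed on locals inside an inout function
    return x;
}

void main()
{
    int m = 1;
    immutable int i = 2;

    int* pm = first(&m);             // mutable in, mutable out
    immutable(int)* pi = first(&i);  // immutable in, immutable out

    *pm = 5;
    assert(m == 5);
    assert(*pi == 2);
    static assert(is(typeof(first(&i)) == immutable(int)*));
}
```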


Re: The next iteration of scope

2015-03-16 Thread Zach the Mystic via Digitalmars-d

On Monday, 16 March 2015 at 20:50:46 UTC, Marc Schütz wrote:

On Monday, 16 March 2015 at 19:43:01 UTC, Zach the Mystic wrote:

I always tend to think of member functions as if they weren't:

struct S {
 T t;
 ref T fun() return {
   return t;
 }
}

In my head, I just translate fun() above to:

ref T fun(return S* __this) {
 return __this.t;
}

Therefore whatever the scope of `__this`, that's the scope of 
the return, just like it would be for any other parameter. 
Then:


S s;
s.fun();

... is really just `fun(s);` in disguise. That's why it's hard 
for me to grasp `scope` members, because they seem to me to be 
just as scope as their parent, whether global or local.


It works just the same:

struct S {
  private int* payload_;
  ref int* payload() return {
    return payload_;
  }
}

ref int* payload(scope ref S __this) return {
  return __this.payload_; // well, imagine it's not private
}


More accurately,

// `return` is moved
ref int* payload(return scope ref S __this) {
   return __this.payload_;
}

I think that if you need `return` to make it safe, there's much 
less need for `scope`.


Both the S.payload() and the free-standing payload() do the 
same thing.


From inside the functions, `return` tells us that we're allowed
to return a reference to our payload. From the caller's point of
view, it signifies that the return value is scoped to the first
argument, or `this` respectively.


To reiterate, `scope` members are just syntactical sugar for 
the kinds of accessor methods/functions in the example code. 
There's nothing special about them.


That's fine, but then there's the argument that syntax sugar is 
different from real functionality. To add it would require a 
compelling use case.


My fundamental issue with `scope` in general is that it should be 
the safe default, which means it doesn't really need to appear 
that often. If @safe is default, the compiler would force you to 
mark any parameter `return` when it detected such a return.


How a member could be scope when the parent is global is hard 
for me to imagine.


The following is clear, right?

int* p;
scope int* borrowed = p;

That's clearly allowed, we're storing a reference to a global 
or GC object into a scope variable. Now let's use `S`, which 
contains an `int*` member:


S s;
scope S borrowed_s = s;

That's also ok. Doesn't matter whether it's the pointer itself, 
or something containing the pointer. And now the final step:


scope int* p2;
p2 = s.payload;  // OK
p2 = borrowed_s.payload; // also OK
static int* p3;
p3 = s.payload;  // NOT OK!

However, if `payload` were not the accessor method/function, 
but instead a simple (non-scope) member of `S`, that last line 
would be allowed, because there is nothing restricting its use.


See above. With `return` being forced on the implicit this 
parameter:


ref int* payload(return /*scope*/ ref S __this) { ... }

`return` covers the need for safety, unless I'm still missing 
something.


For members that the struct owns and wants to manage itself,
this is not good. Therefore, we make it private and allow
access to it only through accessor methods/functions that are
annotated with `return`. But we could accidentally forget an
annotation, and the pointer could escape.


Same argument. Forgetting `return` in safe code == compiler 
error. I think DIP25 already does this.


Re: const as default for variables

2015-03-16 Thread Zach the Mystic via Digitalmars-d

On Monday, 16 March 2015 at 19:52:00 UTC, deadalnix wrote:

On Monday, 16 March 2015 at 14:40:51 UTC, Zach the Mystic wrote:

On Sunday, 15 March 2015 at 20:09:56 UTC, deadalnix wrote:

On Sunday, 15 March 2015 at 07:44:50 UTC, Walter Bright wrote:
const ref can tell the optimizer that that path for a ref 
counted object cannot alter its ref count.


It is not clear why. Is const ref supposed to protect
against escaping when ref does not?


There are two cases here. One is when the reference is copied
to a new variable, which would actually break const because the
reference count of the original data would have to be
incremented (which is a separate issue).


I think we should provide a library solution for this kind of
thing.


Changing the reference count is a very low-level operation. I'm 
not sure how to go about breaking the type system in order to 
support `const` variations on it.


`return const` parameters make `inout` obsolete

2015-03-16 Thread Zach the Mystic via Digitalmars-d

char* fun(return const char* x);

Compiler has enough information to adjust the return type of 
`fun()` to that of input `x`. This assumes return parameters have 
been generalized to all reference types. Destroy.


Re: const as default for variables

2015-03-16 Thread Zach the Mystic via Digitalmars-d

On Sunday, 15 March 2015 at 20:09:56 UTC, deadalnix wrote:

On Sunday, 15 March 2015 at 07:44:50 UTC, Walter Bright wrote:
const ref can tell the optimizer that that path for a ref 
counted object cannot alter its ref count.


It is not clear why. Is const ref supposed to protect against
escaping when ref does not?


There are two cases here. One is when the reference is copied to
a new variable, which would actually break const because the
reference count of the original data would have to be incremented 
(which is a separate issue). But the other case is where the 
original is reassigned, in which the counter for the data it used 
to point to gets decremented, possibly to zero. `const` would 
guarantee against this. But even this is a blunt force weapon, 
because it would also stop you from mutating the original data, 
even though that wouldn't change the reference count.
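
A small sketch of the "blunt force" point, using a plain pointer wrapper rather than a real refcounted type. `Handle`, `viaConst` and `viaRef` are hypothetical names; the `static assert`s simply record what the compiler rejects.

```d
// Hypothetical wrapper standing in for a counted reference.
struct Handle
{
    int* data;
}

void viaConst(ref const Handle h)
{
    // Reassigning the handle is rejected, which is the guarantee we want...
    static assert(!__traits(compiles, h.data = null));
    // ...but mutating the data it points to is rejected as well.
    static assert(!__traits(compiles, *h.data = 1));
}

void viaRef(ref Handle h)
{
    *h.data = 1;     // mutating the pointee is fine
    h.data = null;   // and so is the reassignment that const would forbid
}

void main()
{
    int x;
    auto h = Handle(&x);
    viaConst(h);
    viaRef(h);
    assert(x == 1 && h.data is null);
}
```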


Re: `return const` parameters make `inout` obsolete

2015-03-16 Thread Zach the Mystic via Digitalmars-d

On Monday, 16 March 2015 at 14:23:42 UTC, ketmar wrote:

On Mon, 16 Mar 2015 14:17:57 +, Zach the Mystic wrote:


char* fun(return const char* x);

Compiler has enough information to adjust the return type of
`fun()` to that of input `x`. This assumes return parameters have
been generalized to all reference types. Destroy.


but why compiler has to rewrite return type? i never told it to 
do that!


It has to if you pass an immutable to x, which you're allowed to 
do. It only gives an error if you assign the result to a mutable 
variable. The point is that the signature still contains all the 
information it needs without `inout`. What old errors will fail 
to be reported and what new errors would it cause? I haven't been 
able to think of any.
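
The behaviour described here (immutable in, error only when the result is bound to something mutable) is what `inout` already gives, and can be checked with a current compiler. A minimal sketch, with `fun` written using `inout` because the `return const` signature is hypothetical:

```d
inout(char)* fun(inout(char)* x) { return x; }

void main()
{
    immutable(char)* s = "abc".ptr;

    // The result is typed immutable(char)* here, so this is accepted:
    immutable(char)* ok = fun(s);

    // Binding it to a mutable pointer is rejected at compile time:
    static assert(!__traits(compiles, { char* bad = fun(s); }));

    assert(*ok == 'a');
}
```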


Re: `return const` parameters make `inout` obsolete

2015-03-16 Thread Zach the Mystic via Digitalmars-d

On Monday, 16 March 2015 at 15:39:39 UTC, ketmar wrote:

On Mon, 16 Mar 2015 15:33:40 +, Zach the Mystic wrote:


On Monday, 16 March 2015 at 14:23:42 UTC, ketmar wrote:

On Mon, 16 Mar 2015 14:17:57 +, Zach the Mystic wrote:


char* fun(return const char* x);

Compiler has enough information to adjust the return type of
`fun()` to that of input `x`. This assumes return parameters have
been generalized to all reference types. Destroy.


but why compiler has to rewrite return type? i never told it to
do that!


It has to if you pass an immutable to x, which you're allowed to
do. It only gives an error if you assign the result to a mutable
variable. The point is that the signature still contains all the
information it needs without `inout`. What old errors will fail
to be reported and what new errors would it cause? I haven't been
able to think of any.


this is the question of consistency. if i wrote `char* fun()`,
i want fun to return `char*`, and i'm not expecting it to change
in the slightest. i don't like when compiler starts to change
things on its own.


I think it's just less cluttered than `inout`. The simple fact is 
that if you try to assign an immutable variable to a mutable 
reference, you will still get an error. I doubt it would take 
long for programmers to adjust to the new way of reading the 
signatures. They just see `return` sitting there in front of 
`const` and know how to handle the situation.


Re: The next iteration of scope

2015-03-16 Thread Zach the Mystic via Digitalmars-d

On Monday, 16 March 2015 at 13:55:43 UTC, Marc Schütz wrote:
Also, what exactly does the `scope` on T payload get you? Is 
it just a more specific version of `return` on the this 
parameter, i.e. `return this.payload`? Why would you need that 
specificity? What is the dangerous operation it is intended to 
prevent?


Nick already answered that. I'll expand on his explanation:

Let's take the RC struct as an example. Instances of RC can 
appear with and without scope. Because structs members inherit 
the scope-ness from the struct, `payload` could therefore be an 
unscoped pointer. It could therefore be escaped 
unintentionally. By adding `scope` to its declaration, we force 
it to be scoped to the structs lifetime, no matter how it's 
accessed.


If an RC'd struct is heap-allocated, but one of its members 
points to the stack, how is it ever safe to escape it? Why 
shouldn't the heap variable be treated as scoped too, inheriting 
the most restricted scope of any of its members? To me, the 
question is not how you can know that a member is scoped, so much 
as how you could know that it *isn't* scoped, i.e. that a 
sub-pointer was global while the parent was local. I think it 
would require a very complicated type system:


struct S {
  T* p;
}

// note the elaborate new return signature
T* fun(return!(S.p) S x) {
  return x.p;
}

T* run() {
  S s;
  s.p = new T; // s local, s.p global
  return fun(s);
}

The above is technically safe, but the question is whether it's 
too complicated for the benefit. In the absence of such a 
complicated system, the safe default is to assume a struct is 
always as scoped as its most scoped member (i.e. transitive 
scoping). Your idea of `scope` members would only be valid in the 
absence of this safe default. But even then it would be of 
limited usefulness, because it would prevent all uses of global 
references in those members, even if the parent was global. For 
me, it comes down to that you can't know if anything is global or 
local until you define an instance of it, which you can't do in 
the struct definition.
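
For comparison, the example above with the hypothetical `return!(S.p)` annotation left out is ordinary D today. This sketch uses `int` in place of `T` and only illustrates the "s local, s.p global" case; it makes no claim about how a scope system would check it.

```d
struct S
{
    int* p;
}

// No annotation: the compiler is told nothing about where x.p points.
int* fun(S x)
{
    return x.p;
}

int* run()
{
    S s;             // s lives on the stack...
    s.p = new int;   // ...but s.p points into the GC heap
    *s.p = 7;
    return fun(s);   // safe in practice: the heap int outlives s
}

void main()
{
    assert(*run() == 7);
}
```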


Re: `return const` parameters make `inout` obsolete

2015-03-16 Thread Zach the Mystic via Digitalmars-d

On Monday, 16 March 2015 at 16:49:36 UTC, ketmar wrote:
having argument modifier that changes function return type is
very surprising regardless of how much people used to it. really,
why should i parse *arguments* to know the (explicitly
specified!) *return* *type*? it's ok with `auto`, it's ok with
`inout`, but when i wrote `char *`, i want `char *`. and then
compiler decides that it knows better. ok, compiler, you win, can
you write the rest of the code for me? no? stupid compiler!


I feel like you're reacting more to Change than to my actual 
point. `inout` wasn't invented because it looks good, but because 
it solves the DRY problem for different input types. `return`
parameters also solve that problem, plus a few more, and with
even less repetition than `inout`. I don't think either type of
signature is that hard to read. It's just a matter of getting
used to them.


Re: `return const` parameters make `inout` obsolete

2015-03-16 Thread Zach the Mystic via Digitalmars-d

On Monday, 16 March 2015 at 16:22:46 UTC, Marc Schütz wrote:

On Monday, 16 March 2015 at 14:17:58 UTC, Zach the Mystic wrote:

char* fun(return const char* x);

Compiler has enough information to adjust the return type of 
`fun()` to that of input `x`. This assumes return parameters 
have been generalized to all reference types. Destroy.


That's a very interesting observation. I never liked the name 
`inout`, as it doesn't describe what it does. The only downside 
I see is that it's more magic, because nothing on the return 
type says its mutability is going to depend on an argument.


I think Kenji also had additional plans for `inout`, related to 
uniqueness. There was a PR. Better ask him whether it's going 
to be compatible.


`return` would work just as well for uniqueness.

inout(T*) fun(inout(T*) x);
-
T* fun(return const T* x);

I don't think any information is being lost. My attitude is that 
unless you are losing information, the underlying logic won't be 
any harder to implement. I think that `return` parameters are a 
building block of `inout`, but more useful because they can be 
used separately from it. Perhaps in the early days of D, it just 
seemed too weird to have `return` parameters, but now with ref 
safety, they are better justified. But if they'd been there back 
then, `inout` probably wouldn't exist, since you can just build 
it in its current form from `const` and `return`.




Re: The next iteration of scope

2015-03-16 Thread Zach the Mystic via Digitalmars-d

On Monday, 16 March 2015 at 17:00:12 UTC, Marc Schütz wrote:
But there is indeed still some confusion on my side. It's about
the question whether `this` should implicitly be passed as 
`scope` or not. Because if it is, scope members are probably 
useless, because they are already implied. I think I should 
remove this suggestion, because it would break too much code 
(in @system).


I always tend to think of member functions as if they weren't:

struct S {
  T t;
  ref T fun() return {
    return t;
  }
}

In my head, I just translate fun() above to:

ref T fun(return S* __this) {
  return __this.t;
}

Therefore whatever the scope of `__this`, that's the scope of the 
return, just like it would be for any other parameter. Then:


S s;
s.fun();

... is really just `fun(&s);` in disguise. That's why it's hard 
for me to grasp `scope` members, because they seem to me to be 
just as scope as their parent, whether global or local. How a 
member could be scope when the parent is global is hard for me to 
imagine.
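
A small sketch of that mental translation in D as it exists today, assuming a compiler that accepts DIP25 `return ref`. The free function `funFree` and its parameter name `self` are made up for illustration; the hidden `this` of a struct method behaves like a `ref` parameter.

```d
struct S
{
    int t;

    // `return` here applies to the hidden `this` reference.
    ref int fun() return
    {
        return t;
    }
}

// Hypothetical free-function equivalent: `this` spelled out as an
// ordinary `return ref` parameter.
ref int funFree(return ref S self)
{
    return self.t;
}

void main()
{
    S s;

    s.fun() = 5;        // member call
    assert(s.t == 5);

    funFree(s) = 7;     // same thing with an explicit parameter
    assert(s.t == 7);

    s.funFree() = 9;    // UFCS makes the two forms look alike
    assert(s.t == 9);
}
```

In both forms the returned reference can only live as long as `s` does, which is the sense in which a member is "just as scope as its parent".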


Re: A few notes on choosing between Go and D for a quick project

2015-03-16 Thread Zach the Mystic via Digitalmars-d

On Monday, 16 March 2015 at 00:27:56 UTC, Walter Bright wrote:
I like the analogy of D being a fully equipped machine shop, as 
opposed to a collection of basic hand tools.


When I was younger it was hard working on my car, because I 
could not afford the right tools. So I made do with whatever 
was available. The results were lots of scrapes and bruises, 
much time invested, and rather crappy repairs. Now I can buy 
the right tools, and boy what a difference that makes! I can 
get professional quality results with little effort.


I agree with this. However, it actually implies a huge amount 
about what I would call D's brand. The fully equipped machine 
shop metaphor has some very serious tradeoffs when applied to 
computer programming languages, the steep learning curve required 
to use the machines correctly, for instance.


But I see advantage in this, because I can see a brand -- that 
is, an identity which distinguishes something from its rivals, 
not by flat-out superiority, but by its commitment to particulars 
-- for D here. I think D can market itself to a certain type of 
programmer, and win the language war by empowering this type of 
programmer, thereby inciting the envy of other types of 
programmers, who over time grudgingly concede the inferiority of 
their own styles and follow the herd.


Brands are all about types of people, rather than of products. I 
would love to see D consciously embrace its own kind of person, 
and not just because it feels good, but because of its value as a 
marketing strategy.


I see D attracting *really* good programmers, programmers from, 
let's say the 90-95th percentile in skill and talent in their 
field on average. By marketing to these programmers specifically 
-- that is, telling everyone that while D is for everyone, it is 
especially designed to give talented and experienced programmers 
the tools they need to get their work done -- even if you repel 
several programmers from, say, the 45th percentile or below in 
exchange for the brand loyalty of one from 92nd percentile or 
above, it's probably a winning strategy, because that one good 
programmer will get more done than all the rest combined.


Re: Smart references

2015-03-15 Thread Zach the Mystic via Digitalmars-d

On Saturday, 14 March 2015 at 15:55:51 UTC, Marc Schütz wrote:
Are the suggested changes also related to the possibility of 
making `ref` a type?


Are there plans to do this? I remember Walter suggested `ref` 
for non-parameters, i.e. local variables, but as a storage 
class, not a type modifier.


I don't think there are plans per se, but if struct semantics are 
made powerful and flexible enough, I can imagine it being 
possible to simply recreate 'ref' parameters as 'Ref!' struct 
templates. For me, the question is what new additions would have 
to be added to structs to enable this. It seems like a good 
thought exercise, regardless of the final decision.


Re: The next iteration of scope

2015-03-15 Thread Zach the Mystic via Digitalmars-d

On Sunday, 15 March 2015 at 14:10:02 UTC, Marc Schütz wrote:

Here's the new version of my scope proposal:
http://wiki.dlang.org/User:Schuetzm/scope2

It's still missing real-life examples, a section on the 
implementation, and a more formal specification, as well as a 
discussion of backwards compatibility. But I thought I'd show 
what I have, so that it can be discussed early on.


I hope it will be more digestible for Walter & Andrei. It's 
more or less an extended version of DIP25, and avoids the need 
for most explicit annotations.


It's great to see your design evolving like this. BIG plus for 
`scope` by default in @safe code -- this makes the proposal much 
more attractive than the alternative.


Functions and methods can be overloaded on scope. This allows 
efficient passing of RC wrappers for instance...


How does the compiler figure out which of the variables it's 
passing to the parameters are `scope` or not? Does the caller try 
the scoped overloads first by default, and only if there's an 
error tries the non-scoped overloads? If so, what causes the 
error?


To specify that the value is returned through another parameter, 
the return!ident syntax can be used...


struct RC(T) if(is(T == class)) {
scope T payload;
T borrow() return {// `return` applies to `this`
return payload;
}
}

The example contains no use of `return!ident`.

Also, what exactly does the `scope` on T payload get you? Is it 
just a more specific version of `return` on the this parameter, 
i.e. `return this.payload`? Why would you need that specificity? 
What is the dangerous operation it is intended to prevent?


Re: Smart references

2015-03-13 Thread Zach the Mystic via Digitalmars-d
On Wednesday, 11 March 2015 at 20:33:07 UTC, Andrei Alexandrescu 
wrote:
I'm investigating D's ability to define and use smart 
references. Per the skeleton at 
http://dpaste.dzfl.pl/9d752b1e9b4e, lines:


#6: You can't default-initialize a ref.

#7: You can't copy a ref - copying should mean copying the 
object itself.


#9: Per this example I'm hooking a reference with an Owner. The 
reference hooks calls to opAddRef and opRelease in the owner.


#23: Assigning the reference really assigns the referred.

#28: A reference is a subtype of ref T. Most operations against 
the reference will be automatically forwarded to the underlying 
object, by reference (ref is important here).


As unittests show, things work quite nicely. There are a few 
things that don't:


#70: Attempting to copy a reference fails on account of the 
disabled postblit. There should be a way to tell the compiler 
to automatically invoke alias this and create a copy of that 
guy.


#81: Moving from a reference works by moving the Ref object. 
There should be a way to tell the compiler that moving should 
really move the payload around.


There are a couple other issues not represented in the 
unittest, for example related to template deduction. In a 
perfect world, Ref would masquerade (aside from having a 
different layout, ctor, and dtor) as an lvalue of type T.


But in fact I think solving the matters above would go a long 
way toward making smart references nicely usable. Although my 
example is centered on reference counting an owner, there are 
other uses of smart references. Are all these worth changing 
the language?


Are the suggested changes also related to the possibility of 
making `ref` a type?


I have no opposition in principle to expanding struct semantics 
to be as transparent as possible. But then I ask, what prevented 
them from being expanded until now? From reading these forums, 
I've learned that C++ reference types have a lot of problems. 
Does expanding the semantics of structs run the risk of 
encountering same sorts of problems C++ references have?


The ideal is to find a way to add semantics without adding 
ambiguity (i.e. to make sure both the compiler and the programmer 
always choose the right interpretation of a given construct). So, 
for example, if you pass a `Ref!X` type to a type `X` parameter, 
or you pass an `X` type to a `Ref!X` parameter, the result is 
easy for both the compiler and the human to figure out. That's 
all I've got.
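
To make the `Ref!X` / `X` interplay concrete, here is a rough sketch using only features D already has (`alias this`, `opAssign`, a disabled default constructor). It is not the skeleton from the dpaste link; `Ref`, `get`, `twice` and `bump` are names invented for this example, and lifetime safety is ignored entirely.

```d
// A hypothetical Ref!T that forwards to the referred object.
struct Ref(T)
{
    private T* target;

    @disable this();                 // no default-initialised Ref

    this(ref T t) { target = &t; }   // unchecked: caller keeps `t` alive

    ref T get() { return *target; }
    alias get this;                  // Ref!T converts to an lvalue T

    // Assigning to the reference assigns the referred object.
    void opAssign(T value) { *target = value; }
}

int twice(int x) { return 2 * x; }
void bump(ref int x) { ++x; }

void main()
{
    int x = 20;
    auto r = Ref!int(x);

    assert(twice(r) == 40);   // Ref!int passed to an int parameter

    r = 5;                    // writes through to x
    assert(x == 5);

    bump(r);                  // binds to `ref int` via alias this
    assert(x == 6);
}
```

Passing an `X` where a `Ref!X` is expected is the direction this sketch does not cover; that is where new language support would presumably be needed.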


Re: Two suggestions for safe refcounting

2015-03-06 Thread Zach the Mystic via Digitalmars-d

On Friday, 6 March 2015 at 14:40:31 UTC, Volodymyr wrote:

On Friday, 6 March 2015 at 07:46:13 UTC, Zach the Mystic wrote:
...
Note how the last member, opIndex, doesn't return a raw E*, 
but only an E* which is paired with a pointer to the same 
RCData instance as the RCArray is:


struct RCElement(E) {
 E* element;
 private RCData* data;

 this(this) {
   data.addRef();
 }
 ~this() {
   data.decRef();
 }
}

This is the best I could do.


It would be necessary to change the type of this from RCArray to 
tuple!(RCArray, RCData). But to me it seems better to use Array and 
change typeof(this) to RefCounter!Array:

assert(typeid(typeof(this)) == typeid(RefCounter!Array));

So how to deal with it:
struct RefCounter(T) // this is struct!
{
void opAddRef();
void opRelease();
alias this = __data;
void[] allocate(size_t);

// Handler for sharing owned resources
auto opShareRes(MemberType)(ref MemberType field)
{
return makeRefCounter(field, __count);
}

private:
size_t __count;
T __data;
}

@resource_owner(RefCounter)
class Array
{
ref int opIndex(size_t i) return
{
return _data[i];
}

// opIndex will be replaced with this function
//RefCounter!int opIndex(size_t i) // @return?
//{
//assert(typeid(this) == typeid(RefCounter!Array));
//return this.opShareRes(_data[i]);
//// after inlining: return makeRefCounter(_data[i], __count);
//}

private int[] _data;
}

Method opShareRes is there to move resources away (share them with 
another owner), and an @return method will change its return type 
to opShareRes's return type. opShareRes also wraps access to public 
fields (and may change the type of the result).


Now Array is actually an alias to RefCounter!Array. Array creation 
is a special case: new Array has to use 
RefCounter!Array.allocate. So the owner manages sharing of array 
parts, allocation and removal.


Options for @resource_owner:
@resource_owner(this) - the class provides 
opAddRef/opRelease/opShareRes by itself, as in DIP74.
@resource_owner(this, MyRCMixin) - MyRCMixin provides 
opAddRef/opRelease/opShareRes and will be mixed into the class. 
(What DIP74 has in mind.)
@resource_owner(Owner) - Owner is a template. Whenever you use an 
owned type T, it will be replaced with Owner!T (even the type of 
this). This case prohibits changing the owning strategy.


You've packed a lot of ideas into one post. Your solution might 
work, but it's hard for me to tell.


Resource owning is close to memory management. Maybe the resource 
owner has to set the memory allocation strategy instead of 
providing an allocate method.


This is an open question. I'm still wrestling with understanding 
all the interlocking systems. The only reason I keep exploring 
them is that sometimes it seems like nobody else understands them 
either. ^_^


Re: Two suggestions for safe refcounting

2015-03-06 Thread Zach the Mystic via Digitalmars-d

On Friday, 6 March 2015 at 14:59:46 UTC, monarch_dodra wrote:

struct RCArray(E) {
 E[] array;
 int* count;
 ...
}
auto x = RCArray([E()]);
E* t = &x[0];


But taking that address is unsafe to begin with. So arguably, 
this isn't that big of a problem.


Taking the address is only really unsafe (in a non-RC'd type) if 
you don't have a lifetime tracking system. As long as the 
lifetime of the address taker is shorter than the lifetime of the 
takee, it's not inherently unsafe. Whether D will end up with 
such a system is a different question.


But I still think there's value in having a separate RCData type, 
because you can save one pointer per instance of RCArray. Right 
now, if you take a slice of an RCArray, your working array might 
not start at the same place as the reserved memory array. 
Therefore you need to keep a pointer to the reserved memory in 
addition to your active working array. If the counter and the 
pointer to the original memory are in the same place, one pointer 
will get you both.
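
The "one pointer saved" claim can be checked with a layout comparison. This is only a sketch: `SliceWithTwoPointers` and `SliceWithRCData` are invented names, and `RCData` is reduced to the two fields discussed in this thread.

```d
struct RCData
{
    int count;
    void[] chunk;   // the reserved memory block
}

// Counter and start of the reserved block tracked separately.
struct SliceWithTwoPointers(E)
{
    E[] window;     // the active working slice
    int* count;
    void* base;
}

// Counter and reserved block reached through one pointer.
struct SliceWithRCData(E)
{
    E[] window;
    RCData* data;
}

// Exactly one pointer smaller per instance.
static assert(SliceWithTwoPointers!int.sizeof
              == SliceWithRCData!int.sizeof + (void*).sizeof);

void main() {}
```

The cost moves into `RCData` itself, which is allocated once per block rather than once per slice.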


I think the idea is worth exploring.

Your first dual reference issue seems much more problematic, as 
there are always cases the compiler can't catch.


How so? If all we're talking about is RC'd types, the compiler 
can catch everything. I think the greater concern is that the 
workarounds will take a toll in runtime performance. I'll try to 
illustrate:


void fun(ref RCStruct a, ref RCStruct b);

auto x = new RCStruct;
fun(x, x);

This wouldn't be safe. If fun() contained the line `a = new 
RCStruct;`, b would point to deleted memory for the rest of the 
function. The normal way to protect against this is to make sure 
there's another reference:


auto y = x;
fun(x,x);

This is actually safe, because y bumps the reference counter to 2 
when initialized, which is enough to cover all possible 
reassignments of x. The compiler could do this automatically. It 
could detect that the parameter x aliases itself and create a 
temporary copy of x. But it would mean the runtime performance 
cost of the copy and postblit and destructor call. So D probably 
can't invest in that strategy, since the programmer should have a 
choice about it.


So it's not about it being impossible to deal with the safety 
problems here, just that the runtime cost is too high.


But there are some ways out. If the given type has no postblit, 
for example (or opAddRef for classes), there's no reason to 
mark the operation unsafe, since you know it's not reference 
counted. Also, const parameters are safe and won't be affected.
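
Here is a minimal, hedged sketch of the aliasing case and the "extra reference" workaround. `RCStruct` below is a toy written for this example (malloc-backed, not thread safe), and `refs` exists only so the counts can be observed.

```d
import core.stdc.stdlib : malloc, free;

// Toy reference-counted handle, for illustration only.
struct RCStruct
{
    private static struct Payload { int count; int value; }
    private Payload* p;

    this(int value)
    {
        p = cast(Payload*) malloc(Payload.sizeof);
        p.count = 1;
        p.value = value;
    }

    this(this) { if (p) ++p.count; }   // copy bumps the count

    ~this()
    {
        if (p && --p.count == 0)
            free(p);
        p = null;
    }

    int refs() const { return p ? p.count : 0; }
}

void fun(ref RCStruct a, ref RCStruct b)
{
    // Releases whatever `a` held. If `b` aliases `a` and nobody else
    // holds the old payload, that payload is freed at this point.
    a = RCStruct(99);
}

void main()
{
    auto x = RCStruct(1);
    auto y = x;               // the extra reference: count is now 2
    assert(x.refs == 2);

    fun(x, x);                // x aliases itself inside fun

    assert(y.refs == 1);      // old payload survived because of y
    assert(x.refs == 1);      // x now owns the new payload
}
```

The copy into `y` is exactly the postblit and destructor cost described above; the sketch only shows that it is sufficient, not that it is cheap.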


Re: RCArray is unsafe

2015-03-05 Thread Zach the Mystic via Digitalmars-d

On Thursday, 5 March 2015 at 18:41:31 UTC, deadalnix wrote:
Kind of OT, but your train of thought is very difficult to 
follow the way you are communicating (ie by updating on 
previous post by answering to yourself).


Could you post some more global overview at some point, so one 
does not need to gather various information for various posts 
please ?


Okay. I seem to be mixing my more well-thought out ideas with 
ideas I get on the spur of the moment. Then they come out in a 
jumble. I have to confess that a lot of my ideas just pop into my 
head.


Did you want me to talk about how I would do ownership with my 
reference safety system?


Re: RCArray is unsafe

2015-03-05 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 4 March 2015 at 18:05:52 UTC, Zach the Mystic wrote:
On Wednesday, 4 March 2015 at 17:22:15 UTC, Steven 
Schveighoffer wrote:
Again, I think this is an issue with the expectation of 
RCArray. You cannot *save* a ref to an array element, only a 
ref to the array itself, because you lose control over the 
reference count.


What you need is a special RCSlave type, which is reference 
counted not to the type of its *own* data, but to its parent's. 
In this case, a RCArraySlave!(T) holds data of type T, but a 
pointer to an RCArray, which it decrements when it gets 
destroyed. This could get expensive, with an extra pointer per 
instance than a regular T, but it would probably be safe.


A way to do this is to have a core RCData type which has the 
count itself and the chunk of memory the count refers to in type 
ambiguous form:


struct RCData {
  int count;
  // the point is that RCData can be type ambiguous
  void[] chunk;

  this(size_t size)
  {
    chunk = new void[size];
    count = 0;
  }
  void addRef() {
    ++count;
  }
  void decRef() {
    if (count && --count == 0)
      delete chunk;
  }
}

Over top of that you create a basic element type which refcounts 
an RCData rather than itself:


struct RCType(E) {
  E element;
  RCData* data;

  this(this)
  {
    data.addRef();
  }

  ~this()
  {
    data.decRef();
  }
  [...etc...]
}

Then you have an RCArray which returns RCType elements when 
indexed rather than naked types:


struct RCArray(E) {

  E[] array;
  private RCData* data;

  RCType!E opIndex(size_t i) return
  {
    return RCType!E(array[i], data);
  }

  this(E[] a)
  {
    data = new RCData(a.length * E.sizeof);
    array = cast(E[]) data.chunk;
  }

  this(this)
  {
    data.addRef();
  }

  ~this()
  {
    data.decRef();
  }

  //...
}

This might work. The idea is to only leak references to types 
which also have pointers to the original data.


Two suggestions for safe refcounting

2015-03-05 Thread Zach the Mystic via Digitalmars-d
As per deadalnix's request, a summary of my thoughts regarding 
the thread "RCArray is unsafe":


It's rather easy to guarantee memory safety from the safe 
confines of a garbage collected system. Let's take this as a 
given.


It's much harder when you step outside that system and try to 
figure out when it is or isn't safe to delete memory. It 
shouldn't be too surprising, therefore, that there are lots of 
pitfalls. Reference counting is a lonely outpost in the 
wilderness which is otherwise occupied by manual memory 
management. It's the only alternative to chaos.


But the walls protecting this outpost are easily breached by any 
dangling reference which is not accounted for.


We have seen two instances of how this can occur. The first, when 
boiled down to its essence, is that there is no corresponding 
bump in the reference count for a parameter which can alias an 
existing reference:


void fun(ref RCStruct a, ref RCStruct b);
RCStruct c;
fun(c,c); // c aliases itself

void gun(ref RCStruct a);
static RCStruct d;
gun(d); // d aliases global d

Because the workarounds are easy:
{
  RCStruct c;
  auto tmp = c;
  fun(c,tmp);

  auto tmp2 = d;
  gun(tmp2);
}
...it seems okay to mark these rare violations @system.

The second, harder problem, is when you take a reference to a 
subcomponent of an RC'd type, e.g. an individual E of an RCArray 
of E:


struct RCArray(E) {
  E[] array;
  int* count;
  ...
}
auto x = RCArray([E()]);
E* t = &x[0];

Here's the problem. If x is assigned to a different RCArray, the 
one t points to will be deleted. On the other hand, if some 
special logic allows the definition of t to increment the 
reference count, then you have a memory leak, because t is not 
designed to keep track of x's original counter.


I don't know if we can get out of this mess. My suggestion 
represents a best-effort attempt. The only way I can see out of 
this problem is to redesign RCArray.


The problem with RCArray is that it owns the data it 
references. If a type different from RCArray, i.e. an individual 
E* into the array of E[], tries to reference the data, it's 
stuck, because it's not an RCArray!E. Therefore, you need to 
separate out the core data from the different types that can 
point to it. The natural place would be right next to its 
reference counter, in a separate struct:


struct RCData {
  int count = 0;
  void[] chunk;

  this(size_t size) {
    chunk = new void[size];
  }
  void addRef() {
    ++count;
  }
  void decRef() {
    if (--count == 0)
      delete chunk;
  }
}

Now RCArray can be redesigned to point to an RCData type. All new 
RC types will also contain a pointer to an RCData instance:


struct RCArray(E) {
  E[] array;
  private RCData* data;

  this(E[] a) {
    data = new RCData(a.length * E.sizeof);
    data.chunk = cast(void[]) a;
    array = a;
  }

  this(this) {
    data.addRef();
  }

  ~this() {
    data.decRef();
  }

  RCElement!E opIndex(size_t i) return {
    return RCElement!E(&array[i], data);
  }
  ...
}

Note how the last member, opIndex, doesn't return a raw E*, but 
only an E* which is paired with a pointer to the same RCData 
instance as the RCArray is:


struct RCElement(E) {
  E* element;
  private RCData* data;

  this(this) {
    data.addRef();
  }
  ~this() {
    data.decRef();
  }
}

This is the best I could do.
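
For anyone who wants to poke at the idea, below is a self-contained sketch of the RCData / RCArray / RCElement split that compiles as ordinary D. It departs from the snippets above in a few labelled ways, all of them assumptions of this sketch rather than part of the proposal: the count starts at 1 for the first owner, memory comes from malloc/free instead of `new`/`delete`, RCArray is constructed from a length, and `liveChunks` and `refs` exist only so the behaviour can be observed.

```d
import core.stdc.stdlib : malloc, free;

int liveChunks;   // demo bookkeeping only

struct RCData
{
    size_t count;
    void[] chunk;   // type-ambiguous block of memory
}

RCData* makeData(size_t bytes)
{
    auto d = cast(RCData*) malloc(RCData.sizeof);
    d.count = 1;                          // the creator is the first owner
    d.chunk = malloc(bytes)[0 .. bytes];
    ++liveChunks;
    return d;
}

void addRef(RCData* d) { ++d.count; }

void decRef(RCData* d)
{
    if (--d.count == 0)
    {
        free(d.chunk.ptr);
        free(d);
        --liveChunks;
    }
}

// An element pointer paired with the block it points into.
struct RCElement(E)
{
    E* element;
    private RCData* data;

    this(E* e, RCData* d) { element = e; data = d; addRef(d); }
    this(this) { if (data) addRef(data); }
    ~this() { if (data) decRef(data); }

    ref E get() { return *element; }
}

struct RCArray(E)
{
    E[] array;
    private RCData* data;

    this(size_t n)
    {
        data = makeData(n * E.sizeof);
        array = cast(E[]) data.chunk;
        array[] = E.init;
    }

    this(this) { if (data) addRef(data); }
    ~this() { if (data) decRef(data); }

    // Never hands out a raw E*, only an element tied to `data`.
    RCElement!E opIndex(size_t i) { return RCElement!E(&array[i], data); }

    size_t refs() const { return data ? data.count : 0; }
}

void main()
{
    auto x = RCArray!int(3);
    x.array[0] = 42;

    auto t = x[0];              // t shares x's RCData
    assert(x.refs == 2);

    x = RCArray!int(1);         // x lets go of the old block
    assert(t.get == 42);        // still valid: t kept the block alive
    assert(liveChunks == 2);    // old block (via t) and new block (via x)
}
```

The price is visible in `RCElement`: every element handle carries a second pointer and pays for a count update on copy and destruction.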


Re: RCArray is unsafe

2015-03-04 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 4 March 2015 at 17:13:13 UTC, Zach the Mystic wrote:
(Also, `pure` functions will need no `static` parameter 
attributes, and functions both `pure` and `@nogc` will not need 
)


...will not need `@noscope` either.


Re: RCArray is unsafe

2015-03-04 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 4 March 2015 at 09:06:01 UTC, Walter Bright wrote:

On 3/4/2015 12:13 AM, deadalnix wrote:
The #1 argument for DIP25 compared to alternative proposal was 
its simplicity. I
assume at this point that we have empirical evidence that this 
is NOT the case.


The complexity of a free list doesn't remotely compare to that 
of adding an ownership system.


My reference safety system has ownership built in, more-or-less 
for free:


http://forum.dlang.org/post/offurllmuxjewizxe...@forum.dlang.org

See also my reply to deadalnix:

http://forum.dlang.org/post/oyaoibmwybzfkhhuf...@forum.dlang.org


Re: RCArray is unsafe

2015-03-04 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 4 March 2015 at 07:50:50 UTC, Manu wrote:

Well you can't get to a subcomponent if not through its owner. 
If the question is about passing RC objects members to functions, 
then the solution is the same as above, the stack needs a 
reference to the parent before it can pass a pointer to its 
member down the line for the same reasons.


Yeah, or you could mimic such a reference by wrapping the call in 
an addRef/release cycle, as a performance optimization.


The trouble then is what if that member pointer escapes? Well I'd 
imagine that it needs to be a scope pointer (I think we all agree 
RC relies on scope). So a raw pointer to some member of an RC 
object must be scope(*).


I have a whole Reference Safety System which doesn't need 
explicit scope because it incorporates it implicitly:


http://forum.dlang.org/post/offurllmuxjewizxe...@forum.dlang.org


That it can't escape, combined with knowledge that the
stack has a reference to its owner, guarantees that it won't
disappear.


I think you and I are on the same page.


Re: RCArray is unsafe

2015-03-04 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 4 March 2015 at 08:13:33 UTC, deadalnix wrote:
On Wednesday, 4 March 2015 at 03:46:36 UTC, Zach the Mystic 
wrote:
That's fine. I like DIP25. It's a start towards stronger 
safety guarantees. While I'm pretty sure the runtime costs of 
my proposal are lower than yours, they do require compiler 
hacking, which means they can wait.


I don't think that it is fine.

At this point we need to :
 - Not free anything as long as something is alive.
 - Can't recycle memory.
 - Keep track of allocated chunk to be able to free them (ie 
implementing malloc on top of malloc).


Well, I don't want to make any enemies. I thought that once the 
compiler was hacked people could just change their 
deferred-freeing code.


Re: RCArray is unsafe

2015-03-04 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 4 March 2015 at 08:13:33 UTC, deadalnix wrote:
On Wednesday, 4 March 2015 at 03:46:36 UTC, Zach the Mystic 
wrote:
That's fine. I like DIP25. It's a start towards stronger 
safety guarantees. While I'm pretty sure the runtime costs of 
my proposal are lower than yours, they do require compiler 
hacking, which means they can wait.


I don't think that it is fine.

At this point we need to :
 - Not free anything as long as something is alive.
 - Can't recycle memory.
 - Keep track of allocated chunk to be able to free them (ie 
implementing malloc on top of malloc).


It means that RC is attached to an ever growing arena. Code 
that would manipulate RCArray and append to it on a regular 
manner must expect some impressive memory consumption.


Even if we manage to do this in phobos (I'm sure we can) it is 
pretty much guaranteed at this point that no one else will, at 
least safely. The benefit is reduced because of the bookkeeping 
that needs to be done for memory to be freed in addition to the 
reference counts themselves.


The #1 argument for DIP25 compared to alternative proposal was 
its simplicity. I assume at this point that we have empirical 
evidence that this is NOT the case.


To me, DIP25 is just the first step towards an ownership system. 
The only language additions you need to it are out! parameters, 
to track escapes to other parameters, static parameters 
(previously called noscope), to say that the parameter won't be 
copied to a global, and one more function attribute (for which I 
can reuse noscope as @noscope) which says the return value will 
not be allocated on the heap. All of these will be rare, as they 
aim to target the exceptional cases rather than the norms 
(scope would be the norm. Hence @noscope to target the rare 
cases):


Examples:

T* fun(return T* a, T* b, T** c);

This signature would indicate complete ownership transferred from 
`a` to the return value, since only `a` can be returned (see why 
below)


T* gun(return out!b T* a, T** b);

`a` is declared to be copied both to the return value and to `b`. 
Therefore it is not owned. (If you're following my previous 
definition of `out!` in DIP71, you'll notice I moved `out!` to 
the source parameter rather than the target, but the point is the 
same.)


T* hun(return T* a) @noscope {
  if(something)
    return a;
  else return new T;
}

Again, no ownership. If you *might* return a heap or global, the 
function must be marked @noscope. (Again I've readapted the word 
to a new meaning from DIP71. I'm using `static` now for 
`noscope`'s original meaning.)


Another example:

T* jun(return static T* a) {
  static T* t;
  t = a;
  return a;
}

Again, no ownership, because of the `static` parameter attribute. 
In a previous post, you suggested that such an attribute was 
unnecessary, but an ownership system would require that a given 
parameter `a` which was returned, not also be copied to a global 
at the same time. So `static` tells the compiler this, and thus 
cancels ownership.


My point is that DIP25's `return` parameters are the beginning of 
an ownership system. An option to specify that the function 
*will* return a given `return` parameter as opposed to *might* 
return it is the only thing needed. Hence the additions named 
above. (Also, `pure` functions will need no `static` parameter 
attributes, and functions both `pure` and `@nogc` will not need )


With the exception of some minor cosmetic changes, all this is 
in, or at least hinted at, in my previously posted Reference 
Safety System:


http://forum.dlang.org/post/offurllmuxjewizxe...@forum.dlang.org

The only thing which bears reiterating is that with better 
attribute inference, the whole system becomes invisible for most 
uses.
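
Of the annotations discussed in this post, only DIP25's `return` on `ref` parameters exists in D; `out!`, `static` parameters and `@noscope` are proposals with no implementation to show. A minimal sketch of the existing piece, assuming a compiler that enforces DIP25 in @safe code (`pick` is a made-up name):

```d
// `a` may be returned by reference; `b` may not.
@safe ref int pick(return ref int a, ref int b)
{
    b += 1;
    return a;      // fine: `a` is marked `return`
    // return b;   // expected to be rejected in @safe code under DIP25
}

@safe void main()
{
    int x = 1, y = 2;

    pick(x, y) = 10;   // the result refers to x
    assert(x == 10);
    assert(y == 3);
}
```

Note that `return` here only says the function *might* return `a`; the stronger "will return, and nothing else escapes" guarantee is what the extra attributes above are meant to add.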


Re: RCArray is unsafe

2015-03-04 Thread Zach the Mystic via Digitalmars-d
On Wednesday, 4 March 2015 at 17:22:15 UTC, Steven Schveighoffer 
wrote:
Again, I think this is an issue with the expectation of 
RCArray. You cannot *save* a ref to an array element, only a 
ref to the array itself, because you lose control over the 
reference count.


What you need is a special RCSlave type, which is reference 
counted not to the type of its *own* data, but to its parent's. 
In this case, a RCArraySlave!(T) holds data of type T, but a 
pointer to an RCArray, which it decrements when it gets 
destroyed. This could get expensive, with an extra pointer per 
instance than a regular T, but it would probably be safe.


Re: RCArray is unsafe

2015-03-04 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 4 March 2015 at 18:05:52 UTC, Zach the Mystic wrote:
On Wednesday, 4 March 2015 at 17:22:15 UTC, Steven 
Schveighoffer wrote:
Again, I think this is an issue with the expectation of 
RCArray. You cannot *save* a ref to an array element, only a 
ref to the array itself, because you lose control over the 
reference count.


What you need is a special RCSlave type, which is reference 
counted not to the type of its *own* data, but to its parent's. 
In this case, a RCArraySlave!(T) holds data of type T, but a 
pointer to an RCArray, which it decrements when it gets 
destroyed. This could get expensive, with an extra pointer per 
instance than a regular T, but it would probably be safe.


Another solution is to get compiler help. If you know the 
lifetime of a sub-reference `p.t` to be shorter than that of its 
RC'd parent `p`, the compiler can wrap `p.t`'s lifetime in an 
addRef/release cycle for `p`. This works in calling a function:


fun(p, p.t);

Let's say that you know that `p.t` won't escape (a different 
question). The compiler doesn't need to know about `p.t` to wrap 
the whole function like this:


p.opAddRef(); // or equivalent
fun(p, p.t);
p.opRelease();

It just needs to know that `p.t`'s lifetime is shorter than `p`'s.
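
Written out by hand, the wrapping might look like the sketch below. No D compiler inserts these calls automatically; `opAddRef` and `opRelease` are just ordinary methods here (the names follow DIP74), and `Node`, `callPinned` and `observedDuringCall` are invented for the example.

```d
// Toy class with a manual count, for illustration only.
final class Node
{
    int t;
    private int refs = 1;

    void opAddRef()  { ++refs; }
    void opRelease() { --refs; }
    int count() const { return refs; }
}

int observedDuringCall;   // records the count seen inside fun

void fun(Node p, ref int t)
{
    observedDuringCall = p.count;
    t += 1;
}

// What the compiler-generated wrapper would amount to.
void callPinned(Node p)
{
    p.opAddRef();                 // pin p for the duration of the call
    scope (exit) p.opRelease();   // unpin even if fun throws
    fun(p, p.t);
}

void main()
{
    auto p = new Node;
    callPinned(p);

    assert(observedDuringCall == 2);   // pinned while fun ran
    assert(p.count == 1);              // balanced afterwards
    assert(p.t == 1);
}
```

Nothing in `fun` has to know about `p.t` specifically; the only requirement is that the reference to `p.t` does not outlive the call.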


Re: RCArray is unsafe

2015-03-04 Thread Zach the Mystic via Digitalmars-d
On Wednesday, 4 March 2015 at 18:17:41 UTC, Andrei Alexandrescu 
wrote:
Yah, this is a fork in the road: either we solve this with 
DIP25 + implementation, or we add stricter static checking 
disallowing two lent references to data in the same scope.


The third solution is to keep track of lifetimes, recognize 
refcounted types for structs the same as suggested for classes in 
DIP74, and wrap the lifetime of the subreference `t.s` in an 
opAdd/Release cycle for `t`, as illustrated in my other reply. 
You could have the compiler recognize a refcounted struct by 
simply declaring void opAddRef(); and void opRelease();, with 
the compiler automatically aliasing them to this(this) and 
~this.


Re: RCArray is unsafe

2015-03-04 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 4 March 2015 at 17:13:13 UTC, Zach the Mystic wrote:

Another example:

T* jun(return static T* a) {
  static T* t;
  t = a;
  return a;
}

Again, no ownership, because of the `static` parameter 
attribute. In a previous post, you suggested that such an 
attribute was unnecessary, but an ownership system would 
require that a given parameter `a` which was returned, not also 
be copied to a global at the same time. So `static` tells the 
compiler this, and thus cancels ownership.


Actually, I think you convinced me before that `static` (or 
`noscope`) parameters wouldn't carry their weight. Instead, 
copying a parameter reference to a global variable is unsafe by 
default. Wrap it in a `@trusted` lambda if you know what you're 
doing. (Trusted lambdas are assumed to copy no reference 
parameters.) In this way, you can assume ownership. Any unsafe 
global escapes are just ignored. ???


Re: RCArray is unsafe

2015-03-04 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 4 March 2015 at 19:22:25 UTC, Zach the Mystic wrote:
On Wednesday, 4 March 2015 at 18:17:41 UTC, Andrei Alexandrescu 
wrote:
Yah, this is a fork in the road: either we solve this with 
DIP25 + implementation, or we add stricter static checking 
disallowing two lent references to data in the same scope.


The third solution is to keep track of lifetimes, recognize 
refcounted types for structs the same as suggested for classes 
in DIP74, and wrap the lifetime of the subreference `t.s` in an 
opAdd/Release cycle for `t`, as illustrated in my other reply. 
You could have the compiler recognize a refcounted struct by 
simply declaring void opAddRef(); and void opRelease();, 
with the compiler automatically aliasing them to this(this) 
and ~this.


I'm sorry, I just realized this proposal is too complicated, and 
it wouldn't even work.


I think stricter static checking in @safe code is the way to go. 
When passing a global RC type to an impure function, or 
duplicating the same RC reference variable in a function call, 
it's unsafe. The workaround is to make copies and use them:


static RcType s; // global
RcType c;

// Instead of:
func(s);
func(c, c);

// ...do this:
auto tmp = s; // get stack reference
func(tmp);
auto d = c; // copy Rc'd type
func(c, d);

Expensive, perhaps, but safe.


Re: RCArray is unsafe

2015-03-03 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 3 March 2015 at 05:12:15 UTC, Walter Bright wrote:

On 3/2/2015 6:04 PM, weaselcat wrote:

On Tuesday, 3 March 2015 at 01:56:09 UTC, Walter Bright wrote:

On 3/2/2015 4:40 PM, deadalnix wrote:
After moving resources, the previous owner can no longer be 
used.


How does that work with the example presented by Marc?


He couldn't pass s and a member of s because s is borrowed as 
mutable.

He would have to pass both as immutable.


A pointer to s could be obtained otherwise and passed.


Under normal circumstances, if the pointer to s is an lvalue, the 
refcount will be bumped when it is taken.


Isn't the only problem now aliasing something (i.e. a global) 
invisibly through a parameter? This is easily solved -- when 
passing a global reference, or duplicating a variable in the same 
call, wrap the call in an add/release cycle. This preserves the 
alias for the duration of the call.


Or are we also talking about taking the address of a non-rc'd 
subcomponent of an rc'd struct?


Re: RCArray is unsafe

2015-03-03 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 3 March 2015 at 08:04:25 UTC, Manu wrote:

My immediate impression on this problem:

s.array[0] is being passed to foo from main. s does not belong 
to main (is global), and main does not hold a reference to 
s.array. Shouldn't main just need to inc/dec array around the 
call to foo when passing un-owned references down the call tree?
It seems to me that there always needs to be a reference 
_somewhere_ on the stack for anything being passed down the call 
tree (unless the function is pure). Seems simplest to capture a 
stack ref at the top level, then as it's received as arguments 
to each callee, it's effectively owned by those functions and 
they don't need to worry anymore.

So, passing global x to some function; inc/dec x around the 
function call that it's passed to...? Then the stack has its own 
reference, and the global reference can go away safely.


This is my position too.

There is another problem being discussed now, however, having to 
do with references to non-rc'd subcomponents of an Rc'd type.


Re: RCArray is unsafe

2015-03-03 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 22:58:19 UTC, Walter Bright wrote:
Pretty dazz idea, dontcha think? And DIP25 still stands 
unscathed :-)


Unless, of course, we missed something obvious.


I was dazzed, but I'm not anymore. I wrote my concern here:

http://forum.dlang.org/post/ylpaqhnuiczfgfpqj...@forum.dlang.org


Re: RCArray is unsafe

2015-03-03 Thread Zach the Mystic via Digitalmars-d
On Tuesday, 3 March 2015 at 16:31:07 UTC, Andrei Alexandrescu 
wrote:

I was dazzed, but I'm not anymore. I wrote my concern here:

http://forum.dlang.org/post/ylpaqhnuiczfgfpqj...@forum.dlang.org


There's a misunderstanding here. The object being assigned 
keeps a trailing list of past values and defers their 
deallocation to destruction. -- Andrei


So you need an extra pointer per instance? Isn't that a big price 
to pay? Is the only problem we're still trying to solve aliasing 
which is not recognized as such and therefore doesn't bump the 
refcounter like it should? An extra pointer would be overkill for 
that. Isn't it better to just recognize the aliasing when it 
happens?
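
For reference, the deferred-freeing scheme as I understand it from 
the quote could be sketched roughly like this. It is only my 
guess at the shape of the idea, not Andrei's actual design; 
`Block`, `DeferredRc` and the helper functions are made-up names, 
copying is disabled to keep it small, and `freed` is a flag on GC 
memory standing in for a real deallocation.

```d
// Hypothetical sketch of deferred freeing; not Andrei's actual design.
struct Block
{
    int refs;
    int value;
    bool freed;  // stand-in for a real deallocation
}

struct DeferredRc
{
    Block* current;
    Block*[] retired; // past values whose count reached zero

    @disable this(this); // copying left out to keep the sketch small

    this(int value) { current = makeBlock(value); }

    // Reassignment: a block whose count hits zero is parked, not freed.
    void assign(int value)
    {
        if (current && --current.refs == 0)
            retired ~= current;
        current = makeBlock(value);
    }

    // Everything parked is released only when the owner itself dies.
    ~this()
    {
        foreach (b; retired)
            b.freed = true;
        if (current && --current.refs == 0)
            current.freed = true;
    }

    private static Block* makeBlock(int value)
    {
        auto b = new Block;
        b.refs = 1;
        b.value = value;
        return b;
    }
}

bool borrowSurvivesReassignment()
{
    auto a = DeferredRc(1);
    Block* borrowed = a.current; // raw borrow, no count bump
    a.assign(2);
    return !borrowed.freed && a.retired.length == 1;
}

bool freedAtDestruction()
{
    Block* borrowed;
    {
        auto a = DeferredRc(1);
        borrowed = a.current;
        a.assign(2);
    } // a's destructor runs here and releases the parked block
    return borrowed.freed;
}
```

The `retired` field is the extra per-instance state I'm asking 
about: the borrow stays valid across reassignment, at the price of 
carrying (and growing) that list until the owner is destroyed.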


As far as taking the address of an RcArray element, the type of 
which element is not itself Rc'ed, it's a different problem. The 
only thing I've been able to come up with is maybe to create a 
wrapper type within RcArray for the individual elements, and have 
that type do refcounting on the parent instead of itself, if 
that's possible.
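
A rough sketch of what such a wrapper might look like, assuming a 
toy refcounted array. `Storage`, `RcArr`, `ElemRef` and `at` are 
hypothetical names for illustration only (the real RCArray has no 
such API), and `freed` is again just a flag on GC memory.

```d
// Hypothetical toy types; the real RCArray has no `at` or `ElemRef`.
struct Storage
{
    int[] items;
    int refs;
    bool freed;  // stand-in for a real deallocation
}

struct RcArr
{
    Storage* s;

    this(int[] items)
    {
        s = new Storage;
        s.items = items;
        s.refs = 1;
    }

    this(this) { if (s) ++s.refs; }

    ~this()
    {
        if (s && --s.refs == 0)
            s.freed = true; // stand-in for freeing the storage
    }

    // Hand out a counted wrapper instead of a bare `ref int`.
    ElemRef at(size_t i) { return ElemRef(this, i); }
}

struct ElemRef
{
    RcArr owner;   // copying the handle bumps the parent's count
    size_t index;

    ref int get() { return owner.s.items[index]; }
}

bool elementKeepsArrayAlive()
{
    auto arr = RcArr([10, 20]);
    auto e = arr.at(1);        // refs == 2
    Storage* raw = arr.s;
    arr = RcArr(new int[0]);   // the handle lets go of the old storage
    return !raw.freed && e.get() == 20;
}
```

The element wrapper costs a handle copy per borrow, so it trades 
the safety problem for exactly the inc/dec traffic everyone is 
trying to avoid; whether that is acceptable is an open question.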


Re: RCArray is unsafe

2015-03-03 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 3 March 2015 at 17:40:59 UTC, Marc Schütz wrote:
All instances need to carry a pointer to refcount anyway, so 
the freelist could just be stored next to the refcount. The 
idea of creating that list, however, is more worrying, because 
it again involves allocations. It can get arbitrarily long.


If the last RcType is a global, will the list ever get freed at 
all?


No, Andrei's proposed solution would take care of that. On 
assignment to RCArray, if the refcount goes to zero, the old 
array is put onto the cleanup list. But there can still be 
borrowed references to its elements. However, these can never 
outlive the RCArray, therefore it's safe to destroy all of the 
arrays in the cleanup list in the destructor.


Wouldn't you need a lifetime system for this? A global, for 
example, couldn't borrow safely. I'm all in favor of an 
ownership/borrowing system, but that would be for a different 
DIP, right? It seems like taking the address of a sub-element of 
an RcType is inherently unsafe, since it separates the memory 
from the refcount.


Re: RCArray is unsafe

2015-03-03 Thread Zach the Mystic via Digitalmars-d
On Tuesday, 3 March 2015 at 18:48:36 UTC, Andrei Alexandrescu 
wrote:

On 3/3/15 9:00 AM, Zach the Mystic wrote:
On Tuesday, 3 March 2015 at 16:31:07 UTC, Andrei Alexandrescu 
wrote:

I was dazzed, but I'm not anymore. I wrote my concern here:

http://forum.dlang.org/post/ylpaqhnuiczfgfpqj...@forum.dlang.org


There's a misunderstanding here. The object being assigned 
keeps a trailing list of past values and defers their 
deallocation to destruction. -- Andrei


So you need an extra pointer per instance?


Yah, or define your type to be single-assignment (probably an 
emerging idiom). You can massage the extra pointer with other 
data thus reducing its cost.



Isn't that a big price to pay? Is the only problem we're still 
trying to solve aliasing which is not recognized as such and 
therefore doesn't bump the refcounter like it should? An extra 
pointer would be overkill for that. Isn't it better to just 
recognize the aliasing when it happens?


It's all tradeoffs. This has runtime overhead.


Isn't allocating and collecting a freelist also overhead?

A static analysis would have the challenges of being permissive 
enough, cheap enough, not add notational overhead, etc. etc.


It's certainly permissive: you can do anything, and compiler 
wraps uncertain operations with add/release cycles automatically. 
These are: passing a global as a mutable reference to an impure 
function; aliasing the same variable in two parameters with 
itself. The unoptimized lowerings would be:


{
  auto tmp = myGlobal; // bump count
  impureFun(myGlobal);
}  // tmp destroyed, --count

{
  auto tmp2 = c; // bump count
  fun(c, c);
} // --count

The only addition is an optimization where the compiler elides 
the assignments and calls the add/release cycles directly.


Re: RCArray is unsafe

2015-03-03 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 4 March 2015 at 03:46:36 UTC, Zach the Mystic wrote:
Just my own past posts. My suggestion is based on the compiler 
doing all the work. I don't know how it could be tested without 
hacking the compiler.


I think that part of the fear of my idea is that I want structs 
to get some of the behavior suggested in DIP74 for classes, i.e. 
the compiler inserts calls to opAddRef/opRelease on its own at 
certain times. Since structs only have postblits and destructors, 
there's no canonical way to call them as separate functions. The 
behavior I'm suggesting would only be good if you had a 
refcounted type, which means it's superfluous if not harmful to 
insert it gratuitously in other kinds of structs.


If it turns out that some of the behavior desirable for 
refcounted classes is useful for structs too, it may be necessary 
to hint to the compiler that a struct is indeed of the refcounted 
type. For example, `void opAddRef();` and `void opRelease();` 
could be specially recognized, with no definitions even permitted 
(error on attempt), implying `alias opAddRef this(this);` and 
`alias opRelease ~this;`.


Re: RCArray is unsafe

2015-03-03 Thread Zach the Mystic via Digitalmars-d
On Tuesday, 3 March 2015 at 21:37:20 UTC, Andrei Alexandrescu 
wrote:

On 3/3/15 12:35 PM, Zach the Mystic wrote:

Isn't allocating and collecting a freelist also overhead?


No. I don't have time now for a proof of concept and it seems 
everybody wants to hypothesize about code that doesn't exist 
instead of writing code and then discussing it.


Okay.


The unoptimized lowerings would be:

{
  auto tmp = myGlobal; // bump count
  impureFun(myGlobal);
}  // tmp destroyed, --count

{
  auto tmp2 = c; // bump count
  fun(c, c);
} // --count

The only addition is an optimization where the compiler elides 
the assignments and calls the add/release cycles directly.


Do you have something reviewable, or just your own past posts?


Just my own past posts. My suggestion is based on the compiler 
doing all the work. I don't know how it could be tested without 
hacking the compiler.


For the time being I want to move forward with DIP25 and 
deferred freeing.


That's fine. I like DIP25. It's a start towards stronger safety 
guarantees. While I'm pretty sure the runtime costs of my 
proposal are lower than yours, they do require compiler hacking, 
which means they can wait.


Re: My Reference Safety System (DIP???)

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 22:00:56 UTC, deadalnix wrote:
You don't put the ownership acquire at the same place, but that 
is the same idea. It is probably even better to do it your way 
(or is it ?).


Yes. Unless the compiler detects that you duplicate a variable in 
two parameters in the same call, you literally have *no* added 
cycles, anywhere:


fun(c, c.c);

This is the only time you pay any penalty (except for passing 
globals, as we now realize, since all globals can alias 
themselves as parameters -- nasty).


Re: RCArray is unsafe

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 20:54:20 UTC, Walter Bright wrote:

On 3/2/2015 12:42 PM, Walter Bright wrote:
For D structs, that means, if there's a postblit, a copy must 
be made. For D ref counted classes, a ref count increment must 
be done.


I was hoping to avoid that, but apparently there's no way.

There are cases where this can be avoided, like calling pure 
functions. Another win for pure functions!


It seems like the most common use case for passing a global to a 
parameter is to process that global in a pure way. You already 
have access to it in an impure function, so you could just access 
it directly. The good news is that from within the passed-to 
function, no further calls will treat it as global.




Re: My Reference Safety System (DIP???)

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 20:04:49 UTC, deadalnix wrote:
I let the night go over that one. Here is what I think is the 
best road forward :
 - triggering postblit and/or ref count bump/decrease is 
prohibited on borrowed.

 - Acquiring and releasing ownership does.

Now that we have this, let's get back to the example:
class C {
C c;

// Make it refcounted somehow, doesn't matter. Andrei's 
// proposal for instance.
}

void boom() {
C c = new C();
c.c = new C();

foo(c, c.c);
}

void foo(ref C c1, ref C c2) {
// Here is where things get different. c1 is borrowed, so
// you can't do c1.c = null before acquiring c1.c beforehand.


Right, I agree with this.


// That means the compiler needs to get a local copy of c1.c,
// bump the refcount to get ownership before executing
// c1.c = null, and decrease the refcount.


Yeah, but should it do this inside foo(), or in boom() right 
before it calls foo()? I think in boom(), and only for a 
parameter which might be aliased by another parameter (an 
extremely rare case). For any other case, the refcount has 
already been preserved:


void boom() {
C c = new C(); // refcount(c) == 1
c.c = new C(); // refcount(c.c) == 1
auto d = c.c; // refcount(c.c) == 2 now
foo(c, d); // safe
}

The only problem is the rare case when the exact same identifier 
is getting sent to two different parameters.


I'm sure there will be opportunities to elide a lot of refcount 
calls, but in this case, I don't see much left to elide.


Re: My Reference Safety System (DIP???)

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 3 March 2015 at 00:02:48 UTC, deadalnix wrote:
What do you think? How many times do you normally pass a 
global?


I fail to see how t being global vs. t being a local that is 
doubly passed changes anything.


Within the function, the global passed as a parameter creates an 
alias to the global. Fortunately, Andrei Fermat may have just 
solved the issue:


http://forum.dlang.org/post/md2pub$nqn$1...@digitalmars.com


Re: RCArray is unsafe

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 20:37:46 UTC, Walter Bright wrote:

On 3/1/2015 12:51 PM, Michel Fortin wrote:
That's actually not enough. You'll have to block access to 
global variables too:


S s;

void main() {
s.array = RCArray!T([T()]);   // s.array's refcount is now 1
foo(s.array[0]);   // pass by ref
}
void foo(ref T t) {
s.array = RCArray!T([]);  // drop the old s.array
t.doSomething();  // oops, t is gone
}


So with Andrei's solution, will s.array ever get freed, since s 
is a global? I guess it *should* never get freed, since s is a 
global and it will always exist as a reference.


Which makes me think about a bigger problem... when you opAssign, 
don't you redirect the variable to a different instance? Won't 
the destructor then destroy *that* instance (or not destroy it, 
since it just got a +1 count) instead of the one most recently 
decremented? How does it hold onto the instance to be destroyed?


Re: My Reference Safety System (DIP???)

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 22:51:29 UTC, deadalnix wrote:

On Monday, 2 March 2015 at 22:21:11 UTC, Zach the Mystic wrote:

On Monday, 2 March 2015 at 22:00:56 UTC, deadalnix wrote:
You don't put the ownership acquire at the same place, but 
that is the same idea. It is probably even better to do it 
your way (or is it ?).


Yes. Unless the compiler detects that you duplicate a variable 
in two parameters in the same call, you literally have *no* 
added cycles, anywhere:


fun(c, c.c);

This is the only time you pay any penalty (except for passing 
globals, as we now realize, since all globals can alias 
themselves as parameters -- nasty).


Globals are simply parameters implicitly passed to all 
functions, from a theoretical perspective. There is no reason 
to treat them differently.


Except for this:

static Rctype t; //

fun(t);

Now you have that implicit parameter which screws things up. It's 
like calling:


fun(@globals, t);

...where @globals is a namespace which can alias t. So you have 
two parameters which can alias each other. I think the only 
saving grace is that you probably don't really need to pass a 
global that often, since you already have it if you want it. Only 
if you want the global to play the role of a parameter.


What do you think? How many times do you normally pass a global?


Re: RCArray is unsafe

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 3 March 2015 at 01:23:24 UTC, Zach the Mystic wrote:
Which makes me think about a bigger problem... when you 
opAssign, don't you redirect the variable to a different 
instance? Won't the destructor then destroy *that* instance (or 
not destroy it, since it just got a +1 count) instead of the 
one most recently decremented? How does it hold onto the 
instance to be destroyed?


auto y = new RcStruct;
y = null;

y's old RcStruct gets decremented to zero, but who owns it now? 
Whose destructor ever gets run on it?


Re: RCArray is unsafe

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 22:58:19 UTC, Walter Bright wrote:
His insight was that the deletion of the payload occurred 
before the end of the lifetime of the RC object, and that this 
was the source of the problem. If the deletion of the payload 
occurs during the destructor call, rather than the postblit,


RcArray, a struct, already does this. You wouldn't delete in a 
postblit anyway would you? Do you need opRelease and ~this to be 
separate for structs too?


Re: RCArray is unsafe

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 22:58:19 UTC, Walter Bright wrote:
I.e. the postblit manipulates the ref count, but does NOT do 
payload deletions. The destructor checks the ref count, if it 
is zero, THEN it does the payload deletion.


Pretty dazz idea, dontcha think? And DIP25 still stands 
unscathed :-)


Unless, of course, we missed something obvious.


Add me to "we". I'm dazzed! :-)


Re: RCArray is unsafe

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 22:58:19 UTC, Walter Bright wrote:
His insight was that the deletion of the payload occurred 
before the end of the lifetime of the RC object, and that this 
was the source of the problem. If the deletion of the payload 
occurs during the destructor call, rather than the postblit, 
then although the ref count of the payload goes to zero, it 
doesn't actually get deleted.


I.e. the postblit manipulates the ref count, but does NOT do 
payload deletions. The destructor checks the ref count, if it 
is zero, THEN it does the payload deletion.


I guess you also mean opAssigns -- they would manipulate 
refcounts too right? In fact, they would be the primary means of 
decrementing the refcount *apart* from the destructor, right?


Re: RCArray is unsafe

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 3 March 2015 at 00:05:50 UTC, Zach the Mystic wrote:
I guess you also mean opAssigns -- they would manipulate 
refcounts too right? In fact, they would be the primary means 
of decrementing the refcount *apart* from the destructor, right?


Nevermind. I was a minute too soon with my post!


Re: RCArray is unsafe

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 05:57:35 UTC, Walter Bright wrote:

On 3/1/2015 12:51 PM, Michel Fortin wrote:
That's actually not enough. You'll have to block access to 
global variables too:


Hmm. That's not so easy to solve.


But consider this. It's only an impure function which might alias 
a global. And since you already have access to the global in the 
impure function, there might be less incentive in general to pass 
it through a function. Other than that, you're stuck with a 
theoretical @impure!varName function attribute, for example, 
which tells the caller which globals are accessed.


Re: My Reference Safety System (DIP???)

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 08:59:11 UTC, deadalnix wrote:

On Monday, 2 March 2015 at 00:37:05 UTC, Zach the Mystic wrote:

I'm sure many inc/dec can still be removed.


Do you agree or disagree with what I said? I can't tell.


Yes, but I think this is overly conservative.


I'm arguing a rather liberal position: that only in a very 
exceptional case do you need to protect a variable for the 
duration of a function. For the most part, it's not necessary. 
What am I conserving?


Re: RCArray is unsafe

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Sunday, 1 March 2015 at 20:51:35 UTC, Michel Fortin wrote:

On 2015-03-01 19:21:57 +, Walter Bright said:

The trouble seems to happen when there are two references to 
the same object passed to a function. I.e. there can be only 
one borrowed ref at a time.


I'm thinking this could be statically disallowed in @safe code.


That's actually not enough. You'll have to block access to 
global variables too:


S s;

void main() {
s.array = RCArray!T([T()]);   // s.array's refcount is now 1
foo(s.array[0]);   // pass by ref
}
void foo(ref T t) {
s.array = RCArray!T([]);  // drop the old s.array
t.doSomething();  // oops, t is gone
}


What's the difference between that and this:

void fun() {
  T[] ta = [T()].dup;
  T* t = ta[0];
  delete ta; // or however you do it
  *t = ...;
}

Why is this a parameter passing issue and not a "you kept a 
sub-reference to a deleted chunk" issue?


Re: RCArray is unsafe

2015-03-02 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 15:22:33 UTC, Zach the Mystic wrote:

void fun() {
  T[] ta = [T()].dup;
  T* t = ta[0];


I meant: T* t = &ta[0];


Re: My Reference Safety System (DIP???)

2015-03-01 Thread Zach the Mystic via Digitalmars-d

On Sunday, 1 March 2015 at 14:40:54 UTC, Marc Schütz wrote:

I don't think a callee-based solution can work:

class T {
void doSomething() scope;
}
struct S {
RC!T t;
}
void main() {
auto s = S(RC!T()); // `s.t`'s refcount is 1
T t = s.t;  // borrowing from the RC wrapper
foo(s);
t.doSomething();// oops, `t` is gone
}
void foo(ref S s) {
s.t = RC!T();   // drops the old `s.t`
}


I thought of this, and I disagree. The very fact of assigning to 
`T t` adds the reference count you need to keep `s.t` from 
disintegrating. As soon as you borrow, you increment the count.


Re: My Reference Safety System (DIP???)

2015-03-01 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 00:06:52 UTC, deadalnix wrote:

On Sunday, 1 March 2015 at 23:56:02 UTC, Zach the Mystic wrote:

On Sunday, 1 March 2015 at 14:40:54 UTC, Marc Schütz wrote:

I don't think a callee-based solution can work:

  class T {
  void doSomething() scope;
  }
  struct S {
  RC!T t;
  }
  void main() {
  auto s = S(RC!T()); // `s.t`'s refcount is 1
  T t = s.t;  // borrowing from the RC wrapper
  foo(s);
  t.doSomething();// oops, `t` is gone
  }
  void foo(ref S s) {
  s.t = RC!T();   // drops the old `s.t`
  }


I thought of this, and I disagree. The very fact of assigning 
to `T t` adds the reference count you need to keep `s.t` from 
disintegrating. As soon as you borrow, you increment the count.


I'm sure many inc/dec can still be removed.


Do you agree or disagree with what I said? I can't tell.


Re: Contradictory justification for status quo

2015-03-01 Thread Zach the Mystic via Digitalmars-d
On Saturday, 28 February 2015 at 23:03:23 UTC, Walter Bright 
wrote:

On 2/28/2015 2:31 AM, bearophile wrote:

Zach the Mystic:

You can see exactly how D works by looking at how Kenji 
spends his time. For a while he's only been fixing ICEs and 
other little bugs which he knows for certain will be accepted.


I agree that probably there are often better ways to use 
Kenji's time for the development of D.


Actually, Kenji fearlessly deals with some of the hardest bugs 
in the compiler that require a deep understanding of how the 
compiler works and how it is supposed to work. He rarely does 
trivia. I regard Kenji's contributions as invaluable to the 
community.


I don't think anybody disagrees with this. Kenji's a miracle.


Re: My Reference Safety System (DIP???)

2015-03-01 Thread Zach the Mystic via Digitalmars-d

On Monday, 2 March 2015 at 00:37:05 UTC, Zach the Mystic wrote:

On Monday, 2 March 2015 at 00:06:52 UTC, deadalnix wrote:
I thought of this, and I disagree. The very fact of assigning 
to `T t` adds the reference count you need to keep `s.t` from 
disintegrating. As soon as you borrow, you increment the 
count.


I'm sure many inc/dec can still be removed.


Do you agree or disagree with what I said? I can't tell.


I think I understand now. Yes, they can probably be optimized, 
but that's a different issue than whether you need to protect 
certain RC instances from the tyranny of a function call. My 
whole argument is that basically you don't. Only when you pass 
the same variable twice directly in the call itself, as in 
fun(x, x), does this issue ever matter, and it's easy to deal with.


Re: Contradictory justification for status quo

2015-03-01 Thread Zach the Mystic via Digitalmars-d

On Sunday, 1 March 2015 at 11:30:52 UTC, bearophile wrote:

Walter Bright:

Actually, Kenji fearlessly deals with some of the hardest bugs 
in the compiler that require a deep understanding of how the 
compiler works and how it is supposed to work. He rarely does 
trivia. I regard Kenji's contributions as invaluable to the 
community.


But my point was that probably there are even better things 
that Kenji can do in part of the time he works on D.


I think this once again brings up the issue of what might be 
called "The Experimental Space" (for which std.experimental is 
the only official acknowledgment thus far).


Simply put, there are things which it would be nice to try out, 
which can be conditionally pre-approved depending on how they 
work in real life. There are a lot of things which would be great 
to have, if only some field testing could verify that they aren't 
laden with show-stopping flaws. But these represent a whole 
middle ground between pre-approved, and rejected. The middle 
ground is fraught with tradeoffs -- most prominently that if the 
field testers find the code useful it becomes the de facto 
standard *even if* fatal flaws are discovered in the design. Yet 
if you tell people honestly, "this may not be the final design," 
a lot fewer people will be willing to test it. The Experimental 
Space must have a whole different philosophy about what it is -- 
the promises you make, or more accurately don't make, and the 
courage you have to reject a bad design even when it is already 
being used in real-world code.


Basically, the experimental space must claim "tentatively 
approved for D, pending field testing" -- and it must 
courageously stick to that claim. That might give Kenji the 
motivation to implement some interesting new approaches to old 
problems, knowing that even if in the final analysis they fail, 
they will at least get a chance to prove themselves first.


(Maybe there aren't really that many candidates for this approach 
anyway, but I thought the idea should be articulated at least.)


Re: RCArray is unsafe

2015-03-01 Thread Zach the Mystic via Digitalmars-d

On Sunday, 1 March 2015 at 20:51:35 UTC, Michel Fortin wrote:

On 2015-03-01 19:21:57 +, Walter Bright said:

The trouble seems to happen when there are two references to 
the same object passed to a function. I.e. there can be only 
one borrowed ref at a time.


I'm thinking this could be statically disallowed in @safe code.


That's actually not enough. You'll have to block access to 
global variables too:


S s;

void main() {
s.array = RCArray!T([T()]);   // s.array's refcount is now 1
foo(s.array[0]);   // pass by ref
}
void foo(ref T t) {
s.array = RCArray!T([]);  // drop the old s.array
t.doSomething();  // oops, t is gone
}


Globals to impures, that is.


Re: RCArray is unsafe

2015-03-01 Thread Zach the Mystic via Digitalmars-d

On Sunday, 1 March 2015 at 15:44:49 UTC, Marc Schütz wrote:
Walter posted an example implementation of a reference counted 
array [1], that utilizes the features introduced in DIP25 [2]. 
Then, in the threads about reference counted objects, several 
people posted examples [3, 4] that broke the suggested 
optimization of eliding `opAddRef()`/`opRelease()` calls in 
certain situations.


A weakness of the same kind affects DIP25, too. The core of the 
problem is borrowing (ref return as in DIP25), combined with 
manual (albeit hidden) memory management. An example to 
illustrate:


struct T {
void doSomething();
}
struct S {
RCArray!T array;
}
void main() {
auto s = S(RCArray!T([T()])); // s.array's refcount is now 1
foo(s, s.array[0]);   // pass by ref
}
void foo(ref S s, ref T t) {
s.array = RCArray!T([]);  // drop the old s.array
t.doSomething();  // oops, t is gone
}

Any suggestions how to deal with this? As far as I can see, 
there are the following options:


See:
http://forum.dlang.org/post/bghjqvvrdcfqmoiyy...@forum.dlang.org
...and:
http://forum.dlang.org/post/cviwlkugnothraubc...@forum.dlang.org


Re: Making RCSlice and DIP74 work with const and immutable

2015-03-01 Thread Zach the Mystic via Digitalmars-d
On Sunday, 1 March 2015 at 01:40:40 UTC, Andrei Alexandrescu 
wrote:
Tracing garbage collection can afford the luxury of e.g. 
mutating data that was immutable during its lifetime.


Reference counting needs to make minute mutations to data while 
references to that data are created. In fact, it's not mutation 
of the useful data, the payload of a data structure; it's 
mutation of metadata, additional information about the data 
(i.e. a reference count integral).


The RCOs described in DIP74 and also RCSlice discussed in this 
forum need to work properly with const and immutable. 
Therefore, they need a way to reliably define and access 
metadata for a data structure.


One possible solution is to add a @mutable or @metadata 
attribute similar to C++'s keyword mutable. Walter and I both 
dislike that solution because it's hamfisted and leaves too 
much opportunity for abuse - people can essentially create 
unbounded amounts of mutable payload for an object claimed to 
be immutable. That makes it impossible (or unsafe) to optimize 
code based on algebraic assumptions.


We have a few candidates for solutions, but wanted to open with 
a good discussion first. So, how do you envision a way to 
define and access mutable metadata for objects (including 
immutable ones)?


I need to get educated on this issue. First suggestion: Just 
break the type system by encouraging the idiom of using casts in 
opAddRef and opRelease. It's too easy, but I don't know why.


Re: DIP74 updated with new protocol for function calls

2015-03-01 Thread Zach the Mystic via Digitalmars-d

On Sunday, 1 March 2015 at 07:04:09 UTC, Zach the Mystic wrote:

class RcType {...}

void fun(RcType c, RcType d);

auto x = new RcType;

fun(x, x);

If the compiler were smart, it would realize that by splitting 
parameters this way, it's actually adding an additional 
reference to x. The function should get one x for free, and 
then force an opAdd/opRelease, for every additional x (or x 
derivative) it detects in the same call.


One more tidbit:

class RcType {
  RcType r;
  ...
}

void fun(RcType x, RcType y);

auto z = new RcType;
z.r = new RcType;

fun(z, z.r);

From within fun(), z can alias z.r, but z.r can't possibly alias 
z. Thus, only z.r needs to be preserved. The algorithm should go: 
"For each parameter, add one ref/release cycle for every other 
parameter which could possibly generate an alias to it."


We're approaching optimal here. This is feeling good to me.
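
To pin the rule down, here is a toy version of it in D. It is 
only a sketch under loud assumptions: arguments are modelled as 
access-path strings like "z" or "z.r", reachability is a purely 
syntactic prefix test, and `canReach` and `cyclesNeeded` are 
made-up names. A real compiler would have to reason about types 
and aliasing instead. For a repeated argument I've assumed the 
first occurrence is free and each later duplicate costs one cycle.

```d
import std.algorithm.searching : startsWith;

// Arguments are modelled as access paths such as "z" or "z.r".
// This is a purely syntactic approximation, assumed for illustration;
// a real compiler would reason about types and aliasing instead.

/// True if `owner` is a strict prefix path of `target`, i.e. the callee
/// could reach (and release) `target` by going through `owner`.
bool canReach(string owner, string target)
{
    return target.startsWith(owner ~ ".");
}

/// Number of add/release cycles the caller wraps around each argument.
int[] cyclesNeeded(string[] args)
{
    auto result = new int[args.length];
    foreach (i, target; args)
    {
        foreach (j, owner; args)
        {
            if (i == j)
                continue;
            // A repeated argument: the first occurrence is "free",
            // each later duplicate costs one cycle.
            if (owner == target)
            {
                if (j < i)
                    ++result[i];
            }
            else if (canReach(owner, target))
            {
                ++result[i];
            }
        }
    }
    return result;
}
```

With this model, `fun(z, z.r)` needs one cycle on `z.r` and none 
on `z`, `fun(x, x)` needs one cycle on the second `x`, and 
unrelated arguments need none.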


Re: Improving DIP74: functions borrow by default, retain only if needed

2015-02-28 Thread Zach the Mystic via Digitalmars-d
On Friday, 27 February 2015 at 21:21:08 UTC, Andrei Alexandrescu 
wrote:

On 2/27/15 1:02 PM, Michel Fortin wrote:

On 2015-02-27 20:34:08 +, Steven Schveighoffer said:

void main()
{
   C2 c2 = new C2;
   c2.c = new C;
   foo(c2.c, c2);
}

Still same question. The issue here is how do you know that 
the reference that you are sure is keeping the thing alive is 
not going to release it through some back door.

There are surely other cases, but you get the idea. These three
situations are probably the most common, especially the first 
one. For instance, inside a member function, 'this' is a local 
variable and you will never pass it to another function by ref, 
so it's safe to call 'this.otherFunction()' without retaining 
'this' first.


Thanks. So it seems we continue as we were with DIP74 and leave 
the rest to the implementation.


Hey, I don't think so. I think I figured it out. Keep track 
in-house of which parameters get opReleased, and have the compiler 
insert opAddRef and opRelease at entry and exit to the function 
itself. No performance penalty, no parameter attribute, no 
nothin'. Just an in-house tracking mechanism. Eh???


Re: My Reference Safety System (DIP???)

2015-02-28 Thread Zach the Mystic via Digitalmars-d

On Saturday, 28 February 2015 at 20:49:22 UTC, Marc Schütz wrote:

Any other ideas and opinions?


I'm a little busy. It'll take me some time. There's a lot going 
on in recent days with all these ideas.


Re: DIP74 updated with new protocol for function calls

2015-02-28 Thread Zach the Mystic via Digitalmars-d
On Saturday, 28 February 2015 at 21:12:54 UTC, Andrei 
Alexandrescu wrote:

Defines a significantly better function call protocol:

http://wiki.dlang.org/DIP74

Andrei


This is obviously much better, Andrei.

I think an alternative solution (I know -- another idea -- 
against my own first idea!) is to keep track of this from the 
caller's side. The compiler, in this case, when copying a 
ref-counted type (or derivative) into a parameter, would actually 
check to see if it's splitting the variable in two. Look at this:


class RcType {...}

void fun(RcType c, RcType d);

auto x = new RcType;

fun(x, x);

If the compiler were smart, it would realize that by splitting 
parameters this way, it's actually adding an additional reference 
to x. The function should get one x for free, and then force an 
opAddRef/opRelease for every additional x (or x derivative) it 
detects in the same call.


This might be even better than the improved current proposal. The 
real key is realizing that duplicating an lvalue into the same 
function call is subtly adding a new reference to it.


Eh??
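
A minimal sketch of the caller-side counting described above, not a compiler implementation. It assumes the compiler can already name the variable each argument is rooted in (so `x` and `x.member` both have root "x"); the helper `extraRetains` and its string-based input are hypothetical and only illustrate the "one for free, one retain/release pair per additional copy" rule.

```d
import std.stdio;

// Hypothetical helper: `argRoots` holds, for each argument of one call,
// the name of the variable that argument is rooted in.
// Returns how many extra retain/release pairs the call would need.
size_t extraRetains(string[] argRoots)
{
    bool[string] seen;   // roots already passed once in this call
    size_t extra = 0;
    foreach (root; argRoots)
    {
        if (root in seen)
            ++extra;     // the same root appears again: one more reference
        else
            seen[root] = true; // first appearance is "free"
    }
    return extra;
}

void main()
{
    writeln(extraRetains(["x", "x"]));   // fun(x, x)
    writeln(extraRetains(["c2", "c2"])); // foo(c2.c, c2)
    writeln(extraRetains(["a", "b"]));   // unrelated arguments
}
```

Under these assumptions `fun(x, x)` and `foo(c2.c, c2)` each need one extra pair, while a call with unrelated arguments needs none.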


Re: My Reference Safety System (DIP???)

2015-02-28 Thread Zach the Mystic via Digitalmars-d

On Saturday, 28 February 2015 at 20:49:22 UTC, Marc Schütz wrote:
I encountered an ugly problem. Actually, I had already run into 
it in my first proposal, but Steven Schveighoffer just posted 
about it here, which made me aware again:


http://forum.dlang.org/thread/mcqcor$aa$1...@digitalmars.com#post-mcqk4s:246qb:241:40digitalmars.com

class T {
    void doSomething() scope;
}
struct S {
    RC!T t;
}
void main() {
    auto s = S(RC!T()); // `s.t`'s refcount is 1
    foo(s, s.t);        // borrowing, no refcount changes
}
void foo(ref S s, scope T t) {
    s.t = RC!T();       // drops the old `s.t`
    t.doSomething();    // oops, `t` is gone
}


One quick thing. I suggest a solution here:

http://forum.dlang.org/post/jycylhdhdewtgumba...@forum.dlang.org

You do the checking and adding in the called function, not the 
caller. The algorithm:


1. Keep a compile-time refcount per function. Does the parameter 
get released, i.e. does the refcount ever go below 1? If not, 
stop.


2. Can the parameter contain (as a member) a reference to a 
refcounted struct of the types of any of the other parameters? If 
not, stop.


3. Okay, you need to preserve the reference. Add a call to 
opAddRef at the beginning and one to opRelease at the end of the 
function. Done.
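
A rough, hand-written sketch of what step 3 would amount to, assuming a toy counted class. `Counted`, `Holder`, `retain` and `release` are hypothetical stand-ins (the last two play the role of DIP74's opAddRef and opRelease); in the proposal the compiler would insert the calls after the checks in steps 1 and 2, whereas here they are written by hand.

```d
import std.stdio;

// Hypothetical stand-in for a reference-counted object.
class Counted
{
    int refs = 1;      // the creator holds one reference
    bool alive = true; // becomes false once the count reaches zero

    void retain() { ++refs; }                            // role of opAddRef
    void release() { if (--refs == 0) alive = false; }   // role of opRelease
}

// Hypothetical owner whose member is the only reference to a Counted.
class Holder
{
    Counted c;
    void reset() { if (c !is null) { c.release(); c = null; } }
}

// Without protection: resetting the holder kills `c` mid-function.
bool useUnprotected(Counted c, Holder h)
{
    h.reset();
    return c.alive;
}

// With the step 3 treatment: retain at entry, release at exit.
bool useProtected(Counted c, Holder h)
{
    c.retain();
    scope(exit) c.release(); // runs after the return value is computed
    h.reset();
    return c.alive;
}

void main()
{
    auto h1 = new Holder;
    h1.c = new Counted;
    writeln(useUnprotected(h1.c, h1)); // false

    auto h2 = new Holder;
    h2.c = new Counted;
    writeln(useProtected(h2.c, h2));   // true
}
```

The retain/release pair is only paid for in functions where the parameter can actually be released through another parameter; a function that fails step 1 or step 2 would get neither call.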


Re: My Reference Safety System (DIP???)

2015-02-27 Thread Zach the Mystic via Digitalmars-d

On Friday, 27 February 2015 at 23:18:24 UTC, Marc Schütz wrote:
I think I have an inference algorithm that works. It can infer 
the required scope levels for local variables given the 
constraints of function parameters, and it can even infer the 
annotations for the parameters (in template functions). It can 
also cope with local variables that are explicitly declared as 
`scope`, though these are mostly unnecessary.


Interestingly, the rvalue/lvalue problem deadalnix found is 
only relevant during assignment checking, but not during 
inference. That's because we are free to widen the scope of 
variables that are to be inferred as needed.


It's based on two principles:

* We start with the minimum possible scope a variable may have, 
which is empty for local variables, and its own lifetime for 
parameters.
* When a scoped value is stored somewhere, it is then reachable 
through the destination. Therefore, assuming the source's scope 
is fixed, the destination's scope must be widened to 
accommodate the source's scope.
* From the opposite viewpoint, a value that is to be stored 
somewhere must have at least the destination's scope. 
Therefore, assuming the destination's scope is fixed, the 
source's scope needs to be widened accordingly.


I haven't formalized it yet, but I posted a very detailed 
step-by-step demonstration on my wiki talk page (nicer to read 
because it has syntax highlighting):

http://wiki.dlang.org/User_talk:Schuetzm/scope2


I need to sleep as well right now. But I still don't understand 
where the cycles come from. Taken from your example:


*b = c;
// assignment from `c`:
// => SCOPE(c) |= SCOPE(*b)
// => DEFER because SCOPE(*b) = SCOPE(b) is incomplete

`c` is merely being copied, but you indicate here that it will 
now inherit b's (or some part of b's) scope. Why would c's scope 
inherit b's when it is merely being copied and not written to?


Re: DIP74: Reference Counted Class Objects

2015-02-27 Thread Zach the Mystic via Digitalmars-d
On Thursday, 26 February 2015 at 21:50:56 UTC, Andrei 
Alexandrescu wrote:
http://wiki.dlang.org/DIP74 got to reviewable form. Please 
destroy and discuss.


Thanks,

Andrei


It's kind of funny that you were looking for an edge to my safety 
system -- I'll admit I don't know whether it really has an edge 
or not (it might be too bloated, both function-signature-wise and 
compile-time-wise) -- but one key advantage to any sophisticated 
ownership system is that automated reference counting can elide 
calls which it knows are unnecessary. What struck me in 
particular about DIP74 is how the pass-by-value protocol will 
force many function calls to endure an opAddRef/opRelease cycle, 
even if they do nothing to the reference count.


What really worries me is that if the caller is responsible for 
the opAddRef, while the callee is responsible for the opRelease, 
isn't the potential optimization of eliding them just being 
sacrificed?


Re: Contradictory justification for status quo

2015-02-27 Thread Zach the Mystic via Digitalmars-d
On Friday, 27 February 2015 at 14:02:58 UTC, Andrei Alexandrescu 
wrote:
Safety is good to have, and the simple litmus test is if you 
slap @safe: at the top of all modules and you use no @trusted 
(or of course use it correctly), you should have memory safety, 
guaranteed.


A feature that is safe except for certain constructs is 
undesirable.


It seems like you're agreeing with my general idea of going the 
whole hog.


Generally having a large number of corner cases that require 
special language constructs to address is a Bad Sign.


But D inherits C's separate compilation model. All these cool 
function and parameter attributes (pure, @safe, return ref, etc.) 
could be kept hidden and just used and they would Just Work if D 
didn't have to accommodate separate compilation. From my 
perspective, the only Bad Sign is that D has to navigate the 
tradeoff between:


* concise function signatures
* accurate communication between functions
* enabling separate compilation

It's like you have to sacrifice one to get the other two. 
Naturally I'm not keen on this, so I rush to see how far 
attribute inference for all functions can be taken. Then Dicebot 
suggests automated .di file generation with statically verified 
matching binaries:


http://forum.dlang.org/post/otejdbgnhmyvbyaxa...@forum.dlang.org

The point is that I don't feel the ominous burden of a Bad Sign 
here, because of the inevitability of this conflict.


Re: Improving DIP74: functions borrow by default, retain only if needed

2015-02-27 Thread Zach the Mystic via Digitalmars-d
On Friday, 27 February 2015 at 18:24:27 UTC, Andrei Alexandrescu 
wrote:
DIP74's function call protocol for RCOs has the caller insert 
opAddRef for each RCO passed by value. Then the callee has the 
responsibility to call opRelease (or defer that to another 
entity). This choice of protocol mimics the 
constructor/destructor protocol and probably shows our C++ bias.


However, ARC does not do that. Instead, it implicitly assumes 
the callee is a borrower of the reference. Only if the callee 
wants to copy the parameter to a member or a global (i.e. save 
it beyond the duration of the call), a new call to retain() (= 
opAddRef) is inserted. That way, functions that only need to 
look at the object but not store it incur no reference call 
overhead.


So I was thinking of changing DIP74 as follows:

* Caller does NOT insert an opAddRef for byval RCOs

* Callee does NOT insert an opRelease for its byval RCO 
parameters


It seems everything will just work with this change (including 
all move scenarios), but it is simple enough to make me worry 
I'm missing something. Thoughts?


I think it's fine. I couldn't even figure out the original motive 
for wanting to add those calls -- I thought it must have 
something to do with threads or exceptions or something, but even 
then I couldn't figure it out. Any reference argument will, by 
definition, outlive its function -- it can't possibly die within 
the function itself, since the caller still thinks it's a valid 
reference.


Another thing is that local references in general need not 
participate in reference counting. They will retain and release 
the reference automatically when they go in and out of scope. I'm 
really no expert (except that I like to study and think and by 
thinking become somewhat expert it appears), but if all ARC could 
be confined to global/heap to global/heap copies, you'd get the 
most efficient code. And I'm not trying to advertise a reference 
tracking system :-), but the real hiccup is that a global reference 
can go *through* the stack and land back at a global... and you 
would need to keep track of that.
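
To make the borrow-by-default protocol quoted above concrete, here is a small sketch under stated assumptions: `Counted`, `retain`, `release`, `inspect`, `keep` and `saved` are all hypothetical names, with `retain`/`release` standing in for opAddRef/opRelease. A function that only looks at its argument makes no counting calls; only the function that stores the argument beyond the call retains it.

```d
import std.stdio;

// Hypothetical stand-in for a reference-counted object.
class Counted
{
    int refs = 1;              // the creator holds one reference
    void retain() { ++refs; }  // role of opAddRef
    void release() { --refs; } // role of opRelease
}

Counted saved; // module-level variable that outlives any single call

// Borrower: reads the object, stores nothing, so no counting calls.
int inspect(Counted c)
{
    return c.refs;
}

// Keeper: saves the reference beyond the call, so it must retain.
void keep(Counted c)
{
    c.retain();               // retain the new value first
    if (saved !is null)
        saved.release();      // then drop whatever was saved before
    saved = c;
}

void main()
{
    auto c = new Counted;
    writeln(inspect(c)); // 1: borrowing left the count untouched
    keep(c);
    writeln(c.refs);     // 2: the saved copy owns a second reference
}
```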


Re: Contradictory justification for status quo

2015-02-27 Thread Zach the Mystic via Digitalmars-d

On Friday, 27 February 2015 at 15:35:46 UTC, H. S. Teoh wrote:

@safe has some pretty nasty holes right now... like:

https://issues.dlang.org/show_bug.cgi?id=5270
https://issues.dlang.org/show_bug.cgi?id=8838


My new reference safety system:

http://forum.dlang.org/post/offurllmuxjewizxe...@forum.dlang.org

...would solve the above two bugs. In fact, it's designed 
precisely for bugs like those. Here's your failing use case for 
bug 5270. I'll explain how my system would track and catch the 
bug:


int delegate() globDg;

void func(scope int delegate() dg) {
    globDg = dg; // should be rejected but isn't
    globDg();
}

If func is marked @safe and no attribute inference is permitted, 
this would error, as it copies a reference parameter to a global. 
However, let's assume we have inference. The signature would now 
be inferred to:


void func(noscope scope int delegate() dg);

Yeah it's obviously weird having both `scope` and `noscope`, but 
that's pure coincidence, and moreover, I think the use of `scope` 
here would be made obsolete by my system anyway. (Note also that 
the `noscope` bikeshed has been suggested to be painted `static` 
instead -- it's not about the name, yet... ;-)


void sub() {
    int x;
    func(() { return ++x; });
}

Well I suppose this rvalue delegate is allocated on the stack, 
which will have local reference scope. This is where you'd get 
the safety error in the case of attribute inference, as you can't 
pass a local reference to a `noscope` parameter. The rest is just 
a foregone conclusion (added here for completeness):


void trashme() {
    import std.stdio;
    writeln(globDg()); // prints garbage
}

void main() {
    sub();
    trashme();
}

The next bug, 8838, is a very simple case, I think:

int[] foo() @safe
{
    int[5] a;
    return a[];
}

`a`, being a static array, would have a reference scope depth of 
1, and when you copy the reference to make a dynamic array in the 
return value, the reference scope inherits that of `a`. Any scope 
system would catch this one, I'm afraid. Mine seems like overkill 
in this case. :-/


Re: Improving DIP74: functions borrow by default, retain only if needed

2015-02-27 Thread Zach the Mystic via Digitalmars-d
On Friday, 27 February 2015 at 20:30:20 UTC, Steven Schveighoffer 
wrote:
OK, I found the offending issue. It's when you pass a 
parameter, the only reference holding onto it may be also 
passed as well. Something like:


void foo(C c, C2 c2)
{
    c2.c = null; // this destroys 'c' unless you opAddRef it before passing
    c.someFunc(); // crash
}

void main()
{
    C c = new C; // ref counted class
    C2 c2 = new C2; // another ref counted class
    c2.c = c;
    foo(c, c2);
}

How does the compiler know in this case that it *does* have to 
opAddRef c before calling? Maybe your ARC expert can explain 
how that works.


Split-passing nested ref-counted classes with null loads! How 
insidious!


Re: Improving DIP74: functions borrow by default, retain only if needed

2015-02-27 Thread Zach the Mystic via Digitalmars-d
On Friday, 27 February 2015 at 21:21:08 UTC, Andrei Alexandrescu 
wrote:

On 2/27/15 1:02 PM, Michel Fortin wrote:

On 2015-02-27 20:34:08 +, Steven Schveighoffer said:


On 2/27/15 3:30 PM, Steven Schveighoffer wrote:


void main()
{
C c = new C; // ref counted class
C2 c2 = new C2; // another ref counted class
c2.c = c;
foo(c, c2);
}


Bleh, that was dumb.

void main()
{
   C2 c2 = new C2;
   c2.c = new C;
   foo(c2.c, c2);
}

Still same question. The issue here is how do you know that the 
reference that you are sure is keeping the thing alive is not 
going to release it through some back door.


You have to retain 'c' for the duration of the call unless you 
can prove somehow that calling the function will not cause it to 
be released. You can prove it in certain situations:

- you are passing a local variable as a parameter and nobody has 
  taken a mutable reference (or pointer) to that variable, or to 
  the stack frame (be wary of nested functions accessing the stack 
  frame)
- you are passing a global variable as a parameter to a pure 
  function and aren't giving to that pure function a mutable 
  reference to that variable.
- you are passing a member variable as a parameter to a pure 
  function and aren't giving to that pure function a mutable 
  reference to that variable or its class.

There are surely other cases, but you get the idea. These three 
situations are probably the most common, especially the first 
one. For instance, inside a member function, 'this' is a local 
variable and you will never pass it to another function by ref, 
so it's safe to call 'this.otherFunction()' without retaining 
'this' first.


Thanks. So it seems we continue as we were with DIP74 and leave 
the rest to the implementation.


Andrei


Still seems like a very significant performance penalty for such 
a strange case. It probably won't surprise you that I would 
suggest another parameter attribute to the rescue, 
e.g. `@rcRelease`! Inter-function communication for the win!


Re: My Reference Safety System (DIP???)

2015-02-27 Thread Zach the Mystic via Digitalmars-d

On Friday, 27 February 2015 at 22:10:11 UTC, Marc Schütz wrote:

I put my own version into the Wiki, building on yours:
http://wiki.dlang.org/User:Schuetzm/scope2

It's quite similar to what you propose (at least as far as I 
understand it), and there are a few further user-facing 
simplifications, and provisions for backward compatibility. I 
intentionally kept it as concise as possible; there are neither 
justifications for particular decisions, nor any implementation 
details, nor examples. These can be added later.


I like this phrase: "Because all relevant information about 
lifetimes is contained in the function signature..." This keeps 
seeming more and more important to me. There's no other place 
functions can talk to each other -- and they *really* need to 
talk to each other for any of these advanced features to work 
well. I'm pretty sure it's really the function signature which 
needs designing -- what to add, what can be deduced (and 
therefore not added), and how to express them all elegantly and 
simply. And of course, my favorite Castle in the Sky: attribute 
inference!


I won't really know how your proposal works until I see code 
examples.


For me, it's important to keep the implementation details and 
algorithms separate from the basic workings. Otherwise it's 
hard for me to fully understand it in all aspects.


Okay, but hopefully some examples are forthcoming, cause they 
help *me* think.


Re: Contradictory justification for status quo

2015-02-27 Thread Zach the Mystic via Digitalmars-d

On Friday, 27 February 2015 at 21:09:51 UTC, H. S. Teoh wrote:

https://issues.dlang.org/show_bug.cgi?id=12822
https://issues.dlang.org/show_bug.cgi?id=13442
https://issues.dlang.org/show_bug.cgi?id=13534
https://issues.dlang.org/show_bug.cgi?id=13536
https://issues.dlang.org/show_bug.cgi?id=13537
https://issues.dlang.org/show_bug.cgi?id=14136
https://issues.dlang.org/show_bug.cgi?id=14138

There are probably other holes that we haven't discovered yet.


I wanted to say that besides the first two bugs I tried to 
address, none of the rest in your list involves more than just 
telling the compiler to check for this or that, whatever the case 
may be, per bug. Maybe blanket use of `@trusted` to bypass an 
over-cautious compiler is the only real danger I personally am 
able to worry about.


I simplified my thinking by dividing everything into "in 
function" and "outside of function". So I ask, within a function, 
what do I need to know to ensure everything is safe? And then, 
from outside a function, what do I need to know to ensure 
everything is safe? The function has inputs and outputs, sources 
and destinations.


Re: My Reference Safety System (DIP???)

2015-02-26 Thread Zach the Mystic via Digitalmars-d
On Thursday, 26 February 2015 at 16:40:27 UTC, Zach the Mystic 
wrote:

int r; // declaration scopedepth(0)

void fun(int a /*scopedepth(0)*/) {
    int b; // depth(1)
    {
        int c; // depth(2)
        {
            int d; // (3)
        }
        {
            int e; // (3)
        }
    }
    int f; // (1)
}



You have elements of differing lifetimes at scope depth 0 so far.


Sorry for the delay.

I made a mistake. Parameter `a` will have a *declaration* scope 
of 1, just like int b above. Its *reference* scope will have 
depth 0, with the mystery bit for the first parameter set.


That is, `a` would have such a reference scope is it were a 
reference type... :-)


Re: My Reference Safety System (DIP???)

2015-02-26 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 25 February 2015 at 18:08:55 UTC, deadalnix wrote:
On Wednesday, 25 February 2015 at 01:12:15 UTC, Zach the Mystic 
wrote:

int r; // declaration scopedepth(0)

void fun(int a /*scopedepth(0)*/) {
    int b; // depth(1)
    {
        int c; // depth(2)
        {
            int d; // (3)
        }
        {
            int e; // (3)
        }
    }
    int f; // (1)
}



You have elements of differing lifetimes at scope depth 0 so far.


Sorry for the delay.

I made a mistake. Parameter `a` will have a *declaration* scope 
of 1, just like int b above. Its *reference* scope will have 
depth 0, with the mystery bit for the first parameter set.


Principle 5: It's always un@safe to copy a declaration scope 
from a higher scopedepth to a reference variable stored at 
lower scopedepth. DIP69 tries to banish this type of thing 
only in `scope` variables, but I'm not afraid to banish it in 
all @safe code period:


void gun() @safe {
    T* t; // t's declaration depth: 1
    T u;
    {
        T* uu = &u; // fine, this is normal
        T tt;
        t = &tt; // t's reference depth: 2, error, un@safe
    }
    // now t is corrupted
}



Bingo. However, when you throw goto into the mix, weird things 
happen. The general idea is good but needs refining.


I addressed this further down, in Principle 10. My proposed 
solution has the compiler detecting the presence of code which 
could both 1) be visited again (through a jump label or a loop) 
and 2) is in a branching condition. In these cases it pushes any 
statement which copies a reference onto a special stack. When the 
branching condition finishes, it revisits the stack, reheating 
the scopes in reverse order. If there is a way to defeat this 
technique, it must be very convoluted, since the scopes do 
nothing but accumulate possibilities. It may even be 
mathematically impossible.


Principle 7: In this system, all scopes are *transitive*: any 
reference type with double indirections inherits the scope of 
the outermost reference. Think of it this way:




It is more complex than that, and this is where most proposals 
fall short (including this one and DIP69). If you want to 
disallow the assignment of a reference to something with a 
short lifetime, you can't consider scope transitive when used 
as a lvalue. You can, however, consider it transitive when used 
as an rvalue.


The more general rule is that you want to consider the largest 
possible lifetime of an lvalue, and the smallest possible one 
for an rvalue.


When going through an indirection, that will differ, unless we 
choose to tag all indirections, which is undesirable.


I'm unclear about what you're saying. Can you give an example in 
code?


Principle 8: Any time a reference is copied, the reference 
scope inherits the *maximum* of the two scope depths:




That makes control flow analysis easier, so I can buy this :)

Principle 8: We don't need to know! For all intents and 
purposes, a reference parameter has infinite lifetime for the 
duration of the function it is compiled in. Whenever we copy 
any reference, we do a bitwise OR on *all* of the mystery 
scopes. The new reference accumulates every scope it has ever 
had access to, directly or indirectly.




That would allow copying a parameter reference to a global, 
which is dead unsafe.


Actually, it's not unsafe, so long as you have the parameter 
attribute `noscope` (or possibly `static`) working for you:


void fun(T* a) {
    static T* t;
    t = a; // this might be safe
}

The truth is, this *might* be safe. It's only unsafe if the 
parameter `a` is located on the stack. From within the function, 
the compiler can't possibly know this. But if it forces you to 
mark `a` with `noscope` (or is allowed to infer the same), it 
tells the caller all it needs to know about `a`. Simply put, it's 
an error to pass a local to a `noscope` parameter. And it runs 
all the way down: any parameter which is itself passed to a 
`noscope` parameter must also be marked `noscope`. (Note: I'm 
actually preferring the name `static` at this point, but using 
`noscope` for consistency):


void fun(noscope T* a) {
    static T* t;
    t = a; // this might be safe
}

void tun(T* b) {
    T c;
    fun(&c); // error, local
    fun(b);  // error, unless b also marked (or inferred) `noscope`
}

There is some goodness in there. Please address my comment and 
tell me if I'm wrong, but I think you haven't covered all bases.


The only base I'm really worried about is the lvalue vs rvalue 
base. Hopefully we can fix that!


Re: My Reference Safety System (DIP???)

2015-02-26 Thread Zach the Mystic via Digitalmars-d
On Thursday, 26 February 2015 at 16:42:30 UTC, Zach the Mystic 
wrote:
That is, `a` would have such a reference scope is it were a 
reference type... :-)


s/is/if/

I seem to be making one more mistake for every mistake I correct.


Re: My Reference Safety System (DIP???)

2015-02-26 Thread Zach the Mystic via Digitalmars-d

On Wednesday, 25 February 2015 at 21:26:33 UTC, Marc Schütz wrote:
IIRC H.S. Teoh suggested a change to the compilation model. I 
think he wants to expand the minimal compilation unit to a 
library or executable. In that case, inference for all kinds of 
attributes will be available in many more circumstances; 
explicit annotation would only be necessary for exported 
symbols.


You probably mean Dicebot:

http://forum.dlang.org/post/otejdbgnhmyvbyaxa...@forum.dlang.org

Anyway, it is a good idea to enable scope semantics implicitly 
for all references involved in @safe code. As far as I 
understand it, this is something you suggest, right? It will 
eliminate annotations except in cases where a parameter is 
returned, which - as you note - will probably be acceptable, 
because it's already been suggested in DIP25.


Actually you could eliminate `return` parameters as well, I 
think. If the compiler has the body of a function, which it 
usually does, then there shouldn't be a need to mark *any* of the 
covariant function or parameter attributes. I think it's the kind 
of thing which should Just Work in all these cases.


Principle 4: Scopes. My system has its own notion of scopes. 
They are compile time information, used by the compiler to 
ensure safety. Every declaration which holds data at runtime 
must have a scope, called its declaration scope. Every 
reference type (defined below in Principle 6) will have an 
additional scope called its reference scope. A scope 
consists of a very short bit array, with a minimum of 
approximately 16 bits and reasonable maximum of 32, let's say. 
For this proposal I'm using 16, in order to emphasize this 
system's memory efficiency. 32 bits would not change anything 
fundamental, only allow the compiler to be a little more 
precise about what's safe and what's not, which is not a big 
deal since it conservatively defaults to @system when it 
doesn't know.


This bitmask seems to be mostly an implementation detail.


I guess I'm trying to win over the people who might think the 
system will cost too much memory or compilation time.


AFAIU, further below you're introducing some things that make 
it visible to the user.


The only things I'm making visible to the user are things which 
*must* appear in the function signature for the sake of the 
separate compilation model. Everything else would be invisible, 
except the occasional false positive, where something actually 
safe is thought unsafe (the solution being to enclose the 
statement in a @trusted block or lambda).


I'm not convinced this is a good idea; it looks complicated for 
sure.


It's not that complicated. My main fear is that it's too simple! 
Some of the logic may seem complicated, but the goal is to make 
it possible to compile a function without having to visit any 
other function. Everything is figured out in house.


I also think it is too coarse. Even variables declared at the 
same lexical scope have different lifetimes, because they are 
destroyed in reverse order of declaration. This is relevant if 
they contain references and have destructors that access the 
references; we need to make sure that no reference to a 
destroyed variable can be kept in a variable whose destructor 
hasn't yet run.


It might be too coarse. We could reserve a few more bits for 
depth-constant declaration order. At the same time, it doesn't 
seem *that* urgent to me. But maybe I'm naive about this. 
Everything is being destroyed anyway, so what's the real danger?


Principle 5: It's always un@safe to copy a declaration scope 
from a higher scopedepth to a reference variable stored at 
lower scopedepth. DIP69 tries to banish this type of thing 
only in `scope` variables, but I'm not afraid to banish it in 
all @safe code period:


For backwards compatibility reasons, it might be better to 
restrict it to `scope` variables. But as all references in 
@safe code should be implicitly `scope`, this would mostly have 
the same effect.


I guess this is the Language versus Legacy issue. I think D's 
strength is in its language, not its huge legacy codebase. 
Therefore, I find myself going with the #pleasebreakourcode 
crowd, for the sake of extending D's lead where it shines. I'm 
not sure all references in safe code need to be `scope` - that 
would break a lot of code unto itself, right?



Principle 8 [Principle 7?]: Any time a reference is copied, the 
reference scope inherits the *maximum* of the two scope depths:

T* gru() {
    static T st; // decl depth(0)
    T t;         // decl depth(1)
    T* tp = &t;  // ref depth(1)
    tp = &st;    // ref depth STILL (1)
    return tp;   // error!
}

If you have ever loaded a reference with a local scope, it 
retains that scope level permanently, ensuring the safety of 
the reference.
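
The "sticky maximum" rule can be sketched as a tiny simulation. This is only an illustration of the bookkeeping, assuming a depth of 0 means static lifetime; `RefVar`, `assignAddressOf` and `safeToReturn` are hypothetical names, not part of the proposal.

```d
import std.algorithm.comparison : max;
import std.stdio;

// Hypothetical compile-time record for one reference variable.
struct RefVar
{
    int refDepth = 0; // 0 means "only ever pointed at static data"

    // Called for every `ref = &target`, with the target's declaration depth.
    void assignAddressOf(int declDepth)
    {
        refDepth = max(refDepth, declDepth); // never goes back down
    }

    // Returning is only allowed if no local was ever stored.
    bool safeToReturn() const
    {
        return refDepth == 0;
    }
}

void main()
{
    RefVar tp;
    tp.assignAddressOf(1);     // tp = &t  (local, depth 1)
    tp.assignAddressOf(0);     // tp = &st (static, depth 0)
    writeln(tp.refDepth);      // 1
    writeln(tp.safeToReturn());// false
}
```

The rule is conservative: the sketch rejects `gru` even though `tp` points at the static at the moment of return, which is the price of not doing data flow analysis.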


Why is this rule necessary? Can you show an example what could 
go wrong without it? I assume it's just there to ease 
implementation (avoids the need for data flow analysis)?


You're right. It's only 

Re: Memory safety depends entirely on GC ?

2015-02-26 Thread Zach the Mystic via Digitalmars-d

Here's my best so far:

http://forum.dlang.org/post/offurllmuxjewizxe...@forum.dlang.org

On Tuesday, 24 February 2015 at 20:53:24 UTC, Walter Bright wrote:

My criticisms of it centered around:

1. confusion about whether it was a storage class or a type 
qualifier.


My system has neither. Instead, it just bans unsafe reference 
copying in @safe code.


2. I agree with Andrei that any annotation system can be made 
to work - but this one (as are most annotation systems) also 
struck me as wordy, tedious, and aesthetically unappealing. I 
just can't see myself throwing it up on a slide and trying to 
sell it to the audience as cool.


3. In line with (2), I want a system that relies much more on 
inference. We've made good progress with the existing 
annotations being inferred.


Well you know I'm on board with this. The one penalty my system 
requires is two more parameter attributes, which I'm hoping can 
be alleviated by inference as much as possible.


4. I didn't see how one could, for example, have an array of 
pointers:


int*[] pointers;

and then fill that array with pointers of varying ownership 
annotations.


5. The (4) homogeneity requirement would mean that templated 
types would get new instantiations every time they are used 
with a different ownership. This could lead to massive code 
bloat.


I deliberately designed my system to avoid all associations with 
type. No code bloat.


6. The 'return ref' scheme, which you have expressed distaste 
for, was one that required the fewest instances of the user 
having to add an annotation. It turned out that upgrading 
Phobos to this required only a handful of annotations.


7. 'return ref' makes memory safe ref counted types possible, 
finally, in D, without needing to upend the language or legacy 
code. And as the example I posted showed, they are 
straightforward to write. Only time and experience will tell if 
this will be successful, but it looks promising and I hope 
you'll be willing to give it a chance.


I do give it a chance! See my proposal!


Re: My Reference Safety System (DIP???)

2015-02-26 Thread Zach the Mystic via Digitalmars-d

On Thursday, 26 February 2015 at 20:46:07 UTC, deadalnix wrote:

Consider :

void foo(T** a) {
T** b = a; // OK
T*  = ...;
*b = c; // Legal because of your transitive clause,
// but not safe as a can have an
// arbitrary large lifetime.
}


This example's incomplete, but I can guess you meant something 
like this:


void foo(T** a) {
    T** b = a; // OK
    T d;
    T* c = &d;
    *b = c; // Legal because of your transitive clause,
            // but not safe as `a` can have an
            // arbitrarily large lifetime.
}

This shows that anything you reach through an indirection can 
have from the same lifetime as the indirection up to an 
infinite lifetime (and anything in between). When using it as 
an lvalue, you should consider the largest possible lifetime, 
when using it as an rvalue, you should consider the smallest 
(this is the only way to be safe).


I'm starting to see what you mean. I guess it's only applicable 
to variables with double (or more) indirections (e.g. T**, T***, 
etc.), since only they can lose information with transitive 
scopes. Looks like we need a new rule: variables assigning to one 
of their double indirections cannot acquire a scope-depth greater 
than (or lifetime less than) their current one. Does that fix the 
problem?


Re: My Reference Safety System (DIP???)

2015-02-26 Thread Zach the Mystic via Digitalmars-d

On Thursday, 26 February 2015 at 21:33:53 UTC, Marc Schütz wrote:
On Thursday, 26 February 2015 at 17:56:14 UTC, Zach the Mystic 
wrote:
On Wednesday, 25 February 2015 at 21:26:33 UTC, Marc Schütz 
wrote:

struct A {
B* b;
~this() {
b.doSomething();
}
}

struct B {
void doSomething();
}

void foo() {
    A a;      // declscope(1)
    B b;      // declscope(1)
    a.b = &b; // refscope(1) = declscope(1): OK
    // end of scope:
    // `b` is destroyed
    // `a`'s destructor is called
    // => you're calling a method on a destroyed object
}

Basically, every variable needs to get its own declscope; all 
declscopes form a strict hierarchy (no partial overlaps).


Well, technically you only need one per variable with a 
destructor. Fortunately, this doesn't seem hard to add. Just 
another few bits, allowing as many declarations with destructors 
as seem necessary (4 bits = 15 variables, 5 bits = 31 variables, 
etc.), with the last being treated conservatively as unsafe. (I 
think anyone declaring 31+ variables with destructors in a 
function, and taking the addresses of those variables has bigger 
problems than memory safety!)
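
The destruction order the example relies on can be observed directly. The following is a small runnable sketch; `Tracker`, `destroyed` and `run` are hypothetical names used only for the demonstration.

```d
import std.stdio;

string[] destroyed; // records destructor calls in the order they happen

struct Tracker
{
    string name;
    ~this()
    {
        if (name.length)
            destroyed ~= name; // log which variable is being destroyed
    }
}

void run()
{
    auto a = Tracker("a"); // declared first, destroyed last
    auto b = Tracker("b"); // declared last, destroyed first
}

void main()
{
    run();
    writeln(destroyed); // ["b", "a"]
}
```

Because `b` goes first, a destructor of `a` that dereferences a pointer to `b` touches a destroyed object even though both variables sit at the same lexical depth, which is why a per-variable ordering is needed on top of the depth.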


I guess this is the Language versus Legacy issue. I think 
D's strength is in its language, not its huge legacy 
codebase. Therefore, I find myself going with the 
#pleasebreakourcode crowd, for the sake of extending D's lead 
where it shines.


I am too, actually, but it would be a really hard sell.


But look, Walter and Andrei were fine with adding `return ref` 
parameters. There's hope yet!


I'm not sure all references in safe code need to be `scope` - 
that would break a lot of code unto itself, right?


Not sure how much would be affected. I actually suspect that 
most of it already behaves as if it were scope, with the 
exception of newly allocated memory. But those should ideally 
be owned instead.


But you're right, there still needs to be an opt-out possibility, 
most likely static.


I don't even have a use for `scope` itself in my proposal. The 
only risk I'm running is a lot of false positives -- safe 
constructs which the detection mechanism conservatively treats as 
unsafe because it can't follow the program logic. Still, it's 
hard for me to imagine even these appearing very much. And they 
can be put into @trusted lambdas -- all @trusted functions are 
treated as if they copy no references, effectively canceling any 
parameter attributes to the contrary.



T* fun(T* a, T** b) {
 T* c = new T;
 c = a;
 *b = c;
 return c;
}


Algorithm for inference of ref scopes (= parameter annotations):

1) Each variable, parameter, and the return value get a ref 
scope (or ref depth). A ref scope can either be another 
variable (including `return` and `this`) or `static`.


2) The initial ref scope of variables is themselves.


Actually, no. The *declaration* scope is themselves. The initial 
ref scope is whatever the variable is initialized with, or just 
null if nothing. We could even have a bit for "could be null". 
You might get some null-checking out of this for free. But then 
you'd need more attributes in the signature to indicate "could be 
null"! But crashing due to null is not considered a safety issue 
(I think!), so I haven't gone there yet.


3) Each time a variable (or something reachable through a 
variable) is assigned (returning is assignment to the return 
value), i.e. for each location in the function that an 
assignment happens, the new scope ref will be:


3a) the scope of the source, if it is larger or equal to the 
old scope


If scope depth is >= 1, you inherit the maximum of the source and 
the target. If it's 0, you do a bitwise OR on the mystery scopes 
(unless the compiler can easily prove it doesn't need to), so you 
can accumulate all possible origins of the assigned-to scope.


3b) otherwise (for disjunct scopes, or assignment from smaller 
to larger scope), it is an error (could potentially violate 
guarantees)


I don't have disjunct scopes. There's just greater than and 
less than. The mystery scopes are for figuring out what the 
parameter attributes are, and in the absence of inference, 
causing errors in safe code for the parameters not being 
accurately marked.
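
A rough sketch of how that accumulation might look, applied to the `dum` function that appears further down in this post. This is only one reading of the rule, not a worked-out design: `RefScope`, `assign` and `canReturn` are invented names, the two `uint` fields stand in for the packed bits, and "mystery bit i" is assumed to mean "may have come from parameter i".

```d
import std.stdio : writeln;

// Hypothetical model of a reference scope (names are illustrative).
struct RefScope {
    uint depth;    // 0 = may outlive the function; >= 1 = local lifetime
    uint mystery;  // bit i set: may have come from parameter i
}

// Scope of `target` after `target = source`, under one reading of the rule.
RefScope assign(RefScope target, RefScope source) {
    RefScope result = target;
    if (source.depth >= 1) {
        // Local source: keep the deeper (shorter-lived) of the two.
        result.depth = source.depth > target.depth ? source.depth : target.depth;
    } else {
        // Depth-0 source: accumulate every possible parameter origin.
        result.mystery = target.mystery | source.mystery;
    }
    return result;
}

// A reference may be returned only if it cannot point at a local.
bool canReturn(RefScope s) { return s.depth == 0; }

void main() {
    auto a = RefScope(0, 0b1);          // parameter 0
    auto b = a;                         // T* b = a;
    assert(canReturn(b));               // first `return b;` is fine
    b = assign(b, RefScope(1, 0));      // b = &c; with c declared at depth 1
    assert(!canReturn(b));              // second `return b;` is the error
    assert(b.mystery == 0b1);           // b may still alias parameter 0
    writeln("dum() walk-through behaves as described");
}
```

Because the analysis only ever takes a maximum or ORs bits together, it never needs to revisit an earlier statement, which is what makes a single pass in lexical order plausible.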


4) If a source scope refers to a variable (apart from the 
destination itself), for which not all assignments have been 
processed yet, it is put into a queue, to be evaluated later. 
For code like `a = b; b = a;` there can be dependency cycles. 
Such code will be disallowed.


No, my system is simpler. I want to make this proposal appealing 
from the implementation side as well as from the language side. 
You analyze the code in lexical order:


T* dum(T* a) {
  T* b = a; // b accumulates a
  return b; // okay... lexical ordering, b has a only
  T c;
  b = &c; // now b accumulates scopedepth(1);
  return b; // error here, but *only* here
}

The whole process relies on accumulating the scopes as the 
compiler encounters them. There are cases of branching 
conditional, 

Re: My Reference Safety System (DIP???)

2015-02-26 Thread Zach the Mystic via Digitalmars-d

On Friday, 27 February 2015 at 00:44:21 UTC, deadalnix wrote:
On Thursday, 26 February 2015 at 22:45:19 UTC, Zach the Mystic 
wrote:
I'm starting to see what you mean. I guess it's only 
applicable to variables with double (or more) indirections 
(e.g. T**, T***, etc.), since only they can lose information 
with transitive scopes. Looks like we need a new rule: 
variables assigning to one of their double indirections cannot 
acquire a scope-depth greater than (or lifetime less than) 
their current one. Does that fix the problem?


Cool. I think that can work (I'm not 100% convinced, but at 
least something close to that should work). But that is 
probably too limiting.


Hence the proposed differentiation of lvalue and rvalues.


Yeah, wasn't completely clear. I meant to say:

Variables assigning to one of their double indirections cannot 
acquire a scope-depth greater than (or lifetime less than) their 
current longest-lived one. Also, bear in mind, a parameter could 
be an lvalue:


void fun(T* a, T** b) {
  *b = a;
}

I guess it's just better to use "source" and "target" than "lvalue" 
and "rvalue".


Also bear in mind that in the worst case scenario, any code can 
be made to work by putting it into the newly approved-of idiom: 
The @trusted Lambda! We want a safety mechanism conservative 
enough to catch all failures, accurate enough to avoid too many 
false positives (thus minimizing @trusted lambdas), easy enough 
to implement, and which doesn't tax compile time too heavily. The 
magic Four! I still have a few doubts (recursive inference, for 
example, which can probably be improved), but not too many.


Re: Contradictory justification for status quo

2015-02-26 Thread Zach the Mystic via Digitalmars-d
On Friday, 27 February 2015 at 01:33:58 UTC, Jonathan M Davis 
wrote:
Well, I suspect that each case would have to be examined 
individually to decide upon the best action, but I think that what 
it comes down to is the same problem that we have with getting 
anything done around here - someone has to do it.


This isn't true at all. Things need to be approved first, then 
implemented.


With language changes, it's often the same. Someone needs to 
come up with a reasonable solution and then create a PR for it. 
They then have a much stronger position to argue from, and it may 
get in and settle the issue.


I sometimes feel so bad for Kenji, who has come up with several 
reasonable solutions for longstanding problems, *and* implemented 
them, only to have them be frozen for *years* by indecision at 
the top. I'll never believe your side until this changes. You can 
see exactly how D works by looking at how Kenji spends his time. 
For a while he's only been fixing ICEs and other little bugs 
which he knows for certain will be accepted. I'm not saying any 
of these top level decisions are easy, but I don't believe you 
for a second, at least when it comes to the language itself. 
Phobos may be different.


Re: Contradictory justification for status quo

2015-02-26 Thread Zach the Mystic via Digitalmars-d
On Friday, 27 February 2015 at 02:58:31 UTC, Andrei Alexandrescu 
wrote:
I'm following with interest the discussion "My Reference Safety 
System (DIP???)". Right now it looks like a lot of work - a 
long opener, subsequent refinements, good discussion. It also 
seems just that - there's work but there's no edge to it yet; 
right now a DIP along those ideas is more likely to be rejected 
than approved. But I certainly hope something good will come 
out of it. What I hope will NOT happen is that people come to 
me with a mediocre proposal going, "We've put a lot of Work 
into this. Well?"


Can I ask you a general question about safety: If you became 
convinced that really great safety would *require* more function 
attributes, what would be the threshold for including them? I'm 
trying to go the whole hog with safety, but I'm paying what 
seems to me the necessary price -- more parameter attributes. 
Some of these gains (out! parameters, e.g.) seem like they 
would only apply to very rare code, and yet they *must* be there, 
in order for functions to talk to each other accurately.


Are you interested in accommodating the rare use cases for the 
sake of robust safety, or do you just want to stop at the very 
common use cases (ref returns, e.g.)? ref returns will 
probably cover more than half of all use cases for memory safety. 
Each smaller category will require additions to what a function 
signature can contain (starting with expanding `return` to all 
reference types, e.g.), while covering a smaller number of actual 
use cases... but on the other hand, it's precisely because they 
cover fewer use cases that they will appear so much less often.


Re: Memory safety depends entirely on GC ?

2015-02-24 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 24 February 2015 at 12:44:54 UTC, Marc Schütz wrote:
On Monday, 23 February 2015 at 18:16:38 UTC, Andrei 
Alexandrescu wrote:
On 2/23/15 6:56 AM, Marc Schütz schue...@gmx.net wrote:
These two points have undesirable consequences: All consumers of 
such objects need to be aware of the exact type, which includes 
the management strategy (RC, Unique, GC). But this is a violation 
of the principle of separation of concerns: a consumer shouldn't 
need to have information about the management strategy, it should 
work equally with `RefCounted!C`, `Unique!C` and bare (GC) `C`, as 
long as it doesn't take ownership of the resource.


Well I don't know of another way.


Ok, I wrote my reply assuming that you are aware of the various 
proposals deadalnix, myself and several other people have made 
in the past, some of them quite specific. But now that I think 
of it, I don't remember that you were ever directly referring 
to it in any of your posts. Maybe you just missed it?


As one example, here is what I originally suggested:
http://wiki.dlang.org/User:Schuetzm/scope

It's not completely up to date; during discussions I gained 
many useful new insights to simplify it and make things more 
consistent. It's also part of a bigger picture (deadalnix's 
ideas about ownership play an important role, too), which 
unfortunately isn't easy to recognize, because this page has 
become quite large and unwieldy. I should make a post 
explaining this.


I'm working on my own idea now. I make scope transitive, because 
it's both memory safe and simple to implement, but doing so may 
cause some things which are actually safe to be considered unsafe 
(but then you could just use @system blocks or @trusted lambdas 
to correct this). Also, I don't think `scope` needs to be part of 
the type.


I'm about 90 percent sure, 10 percent unsure that my system will 
work. I'll have it soon enough. It needs DIP25 to be expanded to 
all reference types (not just `ref`), requires my own DIP71, 
http://wiki.dlang.org/DIP71 for total safety, and possibly one or 
two more additions for a reliable ownership. The only real cost 
is added complexity to function signatures (a la DIP25), which 
can and should be inferred in most cases, assuming we aren't 
crippled by an ancient and subpar linking mechanism which 
requires all this manual marking of signatures all the time.


Stay tuned, sir!


My Reference Safety System (DIP???)

2015-02-24 Thread Zach the Mystic via Digitalmars-d
So I've been thinking about how to do safety for a while, and 
this is how I would do it if I got to start from scratch. I think 
it can be harnessed to D, but I'm worried that people will be 
confused by it, or that there might be a show-stopping use case I 
haven't thought of, or that it is simply too cumbersome to be 
taken seriously, but I'll make a DIP when it overcomes these 
three obstacles.


I'm feeding off the momentum built by the approval of DIP25, and 
off of other recent `scope` proposals:

http://wiki.dlang.org/DIP25
http://wiki.dlang.org/User:Schuetzm/scope
http://wiki.dlang.org/DIP69

This system goes farther than either DIP25 or DIP69 towards 
complete safety, but is simpler and easier to implement (I 
think) than Marc Schütz's and deadalnix's proposal. It is not an 
ownership or reference counting system, but can serve as the 
foundation to one. Which leads to...


Principle 1: Memory safety is indispensable to ownership, but not 
the other way around. Memory safety focuses on all the things 
which *might* happen, and casts a wide net, akin to an algebraic 
union, whereas ownership targets specific things, focuses on what 
*will* happen, and is akin to the algebraic intersection of 
things. I will therefore present the memory safety system first, 
and leave grafting an ownership system on top of it for later.


Principle 2: The Function is the key unit of memory safety. The 
compiler must never need to leave the function it is compiling to 
verify that it is safe. This means that no information important 
to safety can be excluded from the signatures of the functions 
that the compiling function is calling. This principle has 
already been conceded in part by Walter and Andrei's acceptance 
of `return ref` parameters in DIP25, which simply implements the 
most common use case where safety is needed. Here I am taking 
this principle to the extreme, in the interest of total safety. 
But speaking of function signatures,


Principle 3: Extra function and parameter attributes are the 
tradeoff for great memory safety. There is no other way to 
support both encapsulation of control flow (Principle 2) and the 
separate-compilation model (indispensable to D). Function 
signatures pay the price for this with their expanding size. I 
try to create the new attributes for the rare case, as opposed to 
the common one, so that they don't appear very often.


Principle 4: Scopes. My system has its own notion of scopes. They 
are compile time information, used by the compiler to ensure 
safety. Every declaration which holds data at runtime must have a 
scope, called its "declaration scope". Every reference type 
(defined below in Principle 6) will have an additional scope 
called its "reference scope". A scope consists of a very short 
bit array, with a minimum of approximately 16 bits and reasonable 
maximum of 32, let's say. For this proposal I'm using 16, in 
order to emphasize this system's memory efficiency. 32 bits would 
not change anything fundamental, only allow the compiler to be a 
little more precise about what's safe and what's not, which is 
not a big deal since it conservatively defaults to @system when 
it doesn't know.


So what are these bits? Reserve 4 bits for an unsigned integer 
(range 0-15) I call "scopedepth". Scopedepth is easier for me to 
think about than lifetime, of which it is simply the inverse, 
with (0) scopedepth being infinite lifetime, 1 having a lifetime 
at function scope, etc. Anyway, a declaration's scopedepth is 
determined according to logic similar to that found in DIP69 and 
Marc Schütz's proposal:


int r; // declaration scopedepth(0)

void fun(int a /*scopedepth(0)*/) {
  int b; // depth(1)
  {
    int c; // depth(2)
    {
      int d; // (3)
    }
    {
      int e; // (3)
    }
  }
  int f; // (1)
}
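
As a rough illustration of the "4 bits inside a 16-bit scope" idea, here is a small sketch of packing and unpacking the scopedepth. The placement of the depth in the low 4 bits, the names `depthOf` and `withDepth`, and the choice to clamp deeper nesting to 15 are all assumptions made for this example; the post itself does not specify them.

```d
import std.stdio : writeln;

enum uint depthBits = 4;
enum uint depthMask = (1u << depthBits) - 1; // 0b1111

// Read the scopedepth stored in the low 4 bits of a 16-bit scope.
uint depthOf(ushort s) { return s & depthMask; }

// Return a copy of `s` with its scopedepth replaced.
// Assumption: nesting deeper than 15 is clamped to 15.
ushort withDepth(ushort s, uint depth) {
    const uint clamped = depth > depthMask ? depthMask : depth;
    return cast(ushort)((s & ~depthMask) | clamped);
}

void main() {
    ushort s = 0;                  // like `int r;` at module level
    assert(depthOf(s) == 0);
    s = withDepth(s, 3);           // like `int d;` three blocks deep
    assert(depthOf(s) == 3);
    assert(depthOf(withDepth(s, 99)) == 15); // clamped
    writeln("ok");
}
```

The remaining 12 bits are left untouched by `withDepth`, so whatever else the scope carries survives a depth update.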

Principle 5: It's always un@safe to copy a declaration scope from 
a higher scopedepth to a reference variable stored at lower 
scopedepth. DIP69 tries to banish this type of thing only in 
`scope` variables, but I'm not afraid to banish it in all @safe 
code period:


void gun() @safe {
  T* t; // t's declaration depth: 1
  T u;
  {
    T* uu = &u; // fine, this is normal
    T tt;
    t = &tt; // t's reference depth: 2, error, un@safe
  }
  // now t is corrupted
}

So you'd have to enclose `t = &tt;` above in a @trusted lambda or 
a @system block. The truth is, it is absurd to copy the address 
of something with shorter lifetime into something with longer 
lifetime... what use would you ever have for it in the 
longer-lived variable? I'm therefore simplifying the system by 
making all instances of this unsafe.


Looking at Principle 5, I realize I forgot:

Principle 6: Reference variables: Any data which stores a 
reference is a reference variable. That includes any pointer, 
class instance, array/slice, `ref` parameter, or any struct 
containing any of those. For the sake of simplicity, I boil _all_ 
of these down to T* in this proposal. All reference types are 

Re: Trusted Manifesto

2015-02-10 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 10 February 2015 at 15:49:24 UTC, Zach the Mystic
wrote:
Wait a second... you're totally right. That is a cool solution. 
The only hiccup is that it might be hard to implement in the 
compiler because of flow tracking (i.e. the error comes before 
the @system block, forcing a recheck of all preceding 
functions.).


I mean all preceding statements.


Re: Trusted Manifesto

2015-02-10 Thread Zach the Mystic via Digitalmars-d

On Tuesday, 10 February 2015 at 16:04:05 UTC, Marc Schütz wrote:
On Tuesday, 10 February 2015 at 15:57:28 UTC, Zach the Mystic 
wrote:
On Tuesday, 10 February 2015 at 15:49:24 UTC, Zach the Mystic 
wrote:
As already pointed out in the other thread, there is a 
non-breaking variant of (3):


3a. Keep named @trusted functions, allow @system blocks 
inside them, but only treat those with @system blocks with 
the new semantics.


But they *have* no semantics without disallowing @system 
code in the rest of the @trusted function.


Wait a second... you're totally right. That is a cool 
solution. The only hiccup is that it might be hard to 
implement in the compiler because of flow tracking (i.e. the 
error comes before the @system block, forcing a recheck of 
all preceding functions.).


I'm sorry I misread you at first -- this is actually really 
cool (notwithstanding the hiccup)!


No problem!

At first I thought it was only a nice deprecation path, but I 
realised that the intermediate stage could even be kept 
indefinitely.


Eventually the error should be the default, I say, but even then, 
a compiler switch can be kept around indefinitely which turns the 
error off.


It probably wouldn't be too complicated to implement, because 
semantic analysis already happens in several stages. I think 
@safe checks happen relatively late, which means that there has 
already been one complete traversal of the function's AST which 
can take a note whenever it sees an @system block.


Well that's just jolly!

