On 2008-11-09 10:10:03 -0500, Andrei Alexandrescu
<[EMAIL PROTECTED]> said:
Michel Fortin wrote:
I'd like to point out that the two things people complained the most
about regarding the automatic dynamic allocation for dynamic closures:
1. There is no way to prevent it, to make sure there is no allocation.
2. The compiler does allocate a lot more than necessary.
In my proposal, these two points are addressed:
1. You can declare any variable as "scope", preventing it from being placed
in a broader scope, preventing at the same time dynamic allocation.
2. The compiler being aware of what arguments do and do not escape the
scope of the called functions, it won't allocate unnecessarily.
So I think the situation would be much better.
I agree that an escape analyzer would improve things. I am not sure
that one oblivious to regions is expressive enough.
If you think I proposed a region-oblivious scheme, then you've got me
wrong (and perhaps it's my fault for not explaining well enough). Let
me explain again, and I'll try to not skip anything this time.
Cyclone has dynamic regions, regions which are allocated on the heap
but that are deleted at the end of the scope that created them.
Basically, those are scoped heaps offering a very useful system to
automatically free memory. (It's somewhat similar in concept to Cocoa's
NSAutoReleasePool for instance.) The downside of them is that you need
to pass region handle around (so called functions can allocate objects
within them).
So my first point is that since we have a garbage collector in D, and
moreover since we're likely to get one heap per thread in D2, we don't
need dynamic regions. The remaining regions are: 1) the shared heap, 2)
the thread-local heap, 3) All the stack frames; and you can't allocate
other stack frames than the current one. Because none of these regions
require a handle to allocate into, we (A) don't need region handles.
We still have many regions. Beside the two heaps (shared,
thread-local), each function's stack frame, and each block within them,
creates a distinct memory region. But nowhere we need to know exactly
which region a function parameter comes from; what we need to know is
which address outlives which pointer, and then we can forbid assigning
addresses to pointers that outlive them. All we need is a relative
ordering of the various regions, and for that we don't need to attach
*names* to the regions so that you can refer explicitly to them in the
syntax. Instead, you could say something like "region of (x)", or
"region of (*y)" and that would be enough.
So there is still a region for every pointer, only regions don't need
to be *named* because you can always refer to them by referring to the
variables. (And perhaps the syntax would be clearer with region names
than without, in which case I don't mind we use them. But they're not
required for the concept to work.)
I'm not too thrilled by references. I once got a question from someone
coming from C: what is the difference between a pointer and a reference
in C++? I had to answer: references are pointers with a different
syntax, no rebindability, and no possibility of being null. It seems he
and I both agree that references are mostly a cosmetic patch to solve a
syntactic problem. References in D aren't much different.
I disagree. References in D are very different. They are not type
constructors. They are storage classes that can only be used in
function signatures, which makes them impossible to dangle. I think C++
references would also have been much better off as storage classes
instead of half-life types.
Which makes me think of this:
struct A { int i; this(); }
ref A foo(ref A a) { return a; }
ref A bar()
{
foo(A()).i = 1;
ref A a = foo(A()); // illegal, ref cannot be used outside
function signature
a.i = 1;
return foo(A()); // illegal ?
}
Also, I'd like to point out that ref (and out) being storage classes
somewhat hinder me from using them where it makes sense in the
D/Objective-C bridge, since there most functions are instanciated by
templates where template arguments give the type of each function
argument. Perhaps there should be a way to specify "ref" and "out" in
template arguments...
If we could have a unified syntax for pointers of all kinds, I think
it'd be more convenient than having two kinds of pointers. A
null-forbiding but rebindable pointer would be more useful in my
opinion than the current reference concept.
Well ref means "This function wants to modify its argument". That is a
very different charter from what pointers mean. So I'm not sure how you
say you'd much prefer this to that. They are not comparable.
I was under the impression that ref would be allowed as a storage class
for local variables. I'll say it's perfectly acceptable for function
arguments, but I'm less sure about function return types.
Also, I'd still like to have a non-null pointer type, especially for
clarifying function sigatures. A template can do. If it was in the
language however it be used by more people, which would be better.
But I'd be curious what others think of it. Notice how the discussion
participants got reduced to you and me, and from what I saw that's not
a good sign.
Indeed. I'm interested in other opinions too.
But I'm under the impression that many lost track of what was being
discussed, especially since we started referring to Cyclone which few
are familiar with and probably few have read the paper.
In my experience, when someone is interested in something, she'd make
time for it. So I take that as lack of interest. And hey, since when
was lack of expertise a real deterrent? :o)
As I said below, I think many people in this group are already
confortable with using pointers, which may explain why they're not so
interested. Having no one interested in something doesn't necessarly
mean they won't appreciate it when it comes.
It does, however reduce the incitative for continuing forward. So I
understand why you're backing off, even if it displease me somewhat.
One of the fears expressed at the start of the thread was about
excessive need for annotation, but as the Cyclone paper say, with good
defaults, you need to add scoping annotation only to a few specific
places. (It took me some time to read the paper and start discussing
things sanely after that, remember?) So perhaps we could get more
people involved if we could propose a tangible syntax for it.
To be very frank, I think we are very far from having an actual
proposal, and syntax is of very low priority now if you want to put one
together. Right now what we have is a few vague ideas and conjectures
(e.g., there's no need for named regions because the need would be rare
enough to require dynamic allocation for those cases). I'm not saying
that to criticize, but merely to underline the difficulties.
I never said the need for dynamic regions would be rare: I said garbage
collector obsoletes it. If we can justify the need for dynamic regions
later, we can add them back (with all the added complexity it requires)
but I'd try without them first.
Or perhaps not; for advanced programmers who already understand well
what can and cannot be done by passing pointers around, full escape
analysis may not seem to be a so interesting gain since they've already
adopted the right conventions to avoid most bugs it would prevent. And
most people here who can discuss this topic with some confidence are
not newbies to programming and don't make too much mistakes of the sort
anymore.
Which makes me think of beginners saying pointers are hard. You've
certainly seen beginners struggle as they learn how to correctly use
pointers in C or C++. Making sure their program fail at compile-time,
with an explicative error message as to why they mustn't do this or
that, is certainly going to help their experience learning the language
more than cryptic and frustrating segfaults and access violations at
runtime, sometime far from the source of the problem.
I totally agree that pointers are hard and good static checking for
them would help. Currently, what we try to do is obviate the need for
pointers in most cases, and to actually forbid them in safe modules.
But dynamic arrays *are* pointers, how are you oblivating the need for
them? If you find a solution for dynamic arrays, you'll have a solution
for pointers too.
You could forbid dynamic arrays from refering to stack-allocated static
ones, or automatically dynamically allocate those when they escape in a
dynamic array. And if I were you, whatever you choose for arrays I'd
allow it for pointers too, to keep things consistent. Pointer to heap
objects should be retained in my opinion.
The question that remains is, how many unsafe modules are necessary,
and what liability do they entail? If there are few and not too
unwieldy, maybe we can declare victory without constructing an escape
analyzer. I agree if you or anyone says they don't think so. At this
point, I am not sure, but what I can say is that it's good to reduce
the need for pointers regardless.
But are you reducing the need for pointers or hiding and restricting
them? I'd say the later. Reference are pointers with restrictions.
Object references are no different from pointer except in syntax (they
can even point to stack allocated objects with scope classes). Dynamic
arrays are pointers with a certain range. Closure have a pointer to a
stack frame, which can be heap-allocated or not.
The only way to have a safe system without escape analysis is to force
everything they can point to to be on the heap, or prevent them from
escaping at all (as with ref). I which there could be some consistency
here.
--
Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/