dsimcha wrote:
== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article
3. A T[new] should be implicitly convertible to a slice. For example:
auto foo = someFunctionThatReturnsTnew();
// foo is a T[new].
T[] bar = someFunctionThatReturnsTnew();
// Works. bar is a T[]. The T[new] went into oblivion.
This solves the problem of slices not being closed over .dup and ~.
Check.
So then why is slices not being closed over .dup, ~, etc. still a problem? With
implicit conversion, they for all practical purposes are.
The problems are with auto and template argument deduction.
4. It should be guaranteed that no block of memory is ever referenced by more
than one T[new] instance. This is needed to guarantee safety when appending to
immutable arrays, etc.
That doesn't go with reference semantics. Uncheck.
5. Assigning a T[new] to another T[new] should be by reference, just like
assigning a class instance to another class instance.
Check. (BTW contradicts 4)
There was a slight misunderstanding here. When I say an instance, I mean one
T[new] heap object == one instance, no matter how many pointers/references to
this
instance you have. The assumption is that T[new] objects live entirely on the
heap and have semantics similar to classes. For example:
uint[new] foo = new uint[100];
uint[new] bar = foo; // Really just a pointer assignment.
Now, both foo and bar point to the same uint[new] instance. If anything is
modified via foo, it is seen when viewed from bar and vice-versa.
To clarify, no *main block of memory that holds the data in the array* is ever
referenced by more than one T[new] *heap object*.
Oh yah. Check that sucker.
Assigning a T[] to a T[new]
should duplicate the memory block referenced by the T[] because this is probably
the only way to guarantee (4).
No check, but could have been done.
Actually, this can even be done by COW. Initially, assigning a T[] to a T[new]
could just be a pointer assignment and setting a flag, as long as a copy of the
data is made when you try to increase the length of the T[new] or append to it.
Basically, as long as you use the T[new] as if it were a T[], no copying needs
to
be done.
6. Since T[new] guarantees unique access to a memory block, it should have an
assumeUnique() method that returns an immutable slice and sets the T[new]'s
reference to the memory block to null. This solves the problem of building
immutable arrays without the performance penalty of not being able to
pre-allocate
or the unsafeness of having to cowboy cast it to immutable.
Uncheck.
Now that I've explained my uniqueness thing better, does this sound feasible?
Nah, there are still problems with Ts that themselves are elaborate
types containing references, aliasing etc. assumeUnique makes a rather
strong claim.
7. As long as the GC is conservative, there absolutely *must* be a method of
manually freeing the memory block referenced by a T[new] provided that the GC
supports this operation, though it doesn't have to be particularly pretty. In
general, since D is a systems language, T[new] should not be too opaque. A good
way to do this might be to make all of the fields of the T[new] public but
undocumented. If you *really* want to mess with it, you'll read the source code
and figure it out.
Check while delete still exists. Please use malloc for that stuff.
As long as I can get at the pointer to the memory block held by T[new] I can
pass
it to GC.free. All that would really be needed is a .ptr property that does
something like what .ptr does for T[]s.
Alrighty. Check that.
So now: every place I've said "check" means there was implementation and
book-quality illustrated documentation that Walter and I have done with
the sweat of our brow. At the end we looked at the result and concluded
we should throw away all that work.
Andrei
This begs the question: Why? Walter's post on the subject was rather brief
and I
can't understand for the life of me why you guys would throw away such an
elegant
solution.
Here's what I wrote to Walter:
====================
I'm going to suggest something terrible - let's get rid of T[new]. I
know it's difficult to throw away work you've already done, but really
things with T[new] start to look like a Pyrrhic victory. Here are some
issues:
* The abstraction doesn't seem to come off as crisp and clean as we both
wanted;
* There are efficiency issues, such as the two allocations that you
valiantly tried to eliminate in a subset of cases;
* Explaining two very similar but subtly different types to newcomers is
excruciatingly difficult (I'll send you a draft of the chapter - it
looks like a burn victim who didn't make it);
* Furthermore, explaining people when to use one vs. the other is much
more difficult than it seems. On the surface, it goes like this: "If you
need to append stuff, use T[new]. If not, use T[]." Reality is much more
subtle. For one thing, T[new] does not allow contraction from the left,
whereas T[] does. That puts T[] at an advantage. So if you want to
append stuff and also contract from the left, there's nothing our
abstractions can help you with.
Instead of all T[new] stuff, I suggest the following:
1. We stay with T[] and we define a struct ArrayBuilder that replaces
T[new] with a much more clear name and charter. Phobos already has
Appender which works very well. We can beef that up to allow array-like
primitives.
2. Assigning to a slice's .length allocates a new slice if growth is needed.
3. Disallow ~= for slices. ArrayBuilder will define it.
4. That's it.
Java got away with a similar approach using StringBuilder:
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuilder.html
Scala has something very similar called ArrayBuffer:
http://www.nabble.com/ArrayList-and-ArrayBuffer-td15448842.html
And guess what, C# stole Java's StringBuilder as well:
http://msdn.microsoft.com/en-us/library/2839d5h5%28VS.71%29.aspx
So it looks like many programmers coming from other languages will
already be familiar with the idea that you use a "builder" to grow an
array, and then you use a non-growable array. One thing that Appender
has and is really cool is that it can grow an already-existing slice. So
you can grow a slice, play with it for a while, and then grow it again
at low cost. I don't think the other languages allow that.
I understand how you must feel about having implemented T[new] and all,
but really please please try to detach for a minute and think back. Does
what we've got now with T[new] make D a much better place? Between the
increase of the language, the difficulty to explain the minute
subtleties, and the annoying corner cases and oddities, I think it might
be good to reconsider.
================
Given that we already agree that a T[new] can be implicitly cast to a
T[], the lack of closure under ~, .dup, etc. seems like a non-issue. When I
was a
beginner, before I got into templates, a major attraction to D was that it was a
fast, compiled language where basic things like arrays (mostly) just worked. To
take away all the syntactic sugar for appending and lengthening from arrays and
push this into a separate library type that doesn't feel like a first class
object
or (I assume) even support most array semantics would be a massive, massive
kludge
no matter how well-implemented that library type was.
I think the language will be in considerably better shape without T[new]
than with it. I understand this is a subjective point. You can argue
against on virtually every one of my objections.
Andrei