Re: The demise of T[new]

Andrei Alexandrescu Sun, 18 Oct 2009 18:20:16 -0700

dsimcha wrote:

== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article

3.  A T[new] should be implicitly convertible to a slice.  For example:


auto foo = someFunctionThatReturnsTnew();
// foo is a T[new].
T[] bar = someFunctionThatReturnsTnew();
// Works.  bar is a T[].  The T[new] went into oblivion.

This solves the problem of slices not being closed over .dup and ~.

Check.


So then why is slices not being closed over .dup, ~, etc. still a problem?  With
implicit conversion, they for all practical purposes are.


The problems are with auto and template argument deduction.

4.  It should be guaranteed that no block of memory is ever referenced by more
than one T[new] instance.  This is needed to guarantee safety when appending to
immutable arrays, etc.

That doesn't go with reference semantics. Uncheck.

5.  Assigning a T[new] to another T[new] should be by reference, just like
assigning a class instance to another class instance.

Check. (BTW contradicts 4)


There was a slight misunderstanding here.  When I say an instance, I mean one
T[new] heap object == one instance, no matter how many pointers/references to 
this
instance you have.  The assumption is that T[new] objects live entirely on the
heap and have semantics similar to classes.  For example:

uint[new] foo = new uint[100];
uint[new] bar = foo;  // Really just a pointer assignment.

Now, both foo and bar point to the same uint[new] instance.  If anything is
modified via foo, it is seen when viewed from bar and vice-versa.

To clarify, no *main block of memory that holds the data in the array* is ever
referenced by more than one T[new] *heap object*.


Oh yah. Check that sucker.

Assigning a T[] to a T[new]
should duplicate the memory block referenced by the T[] because this is probably
the only way to guarantee (4).

No check, but could have been done.


Actually, this can even be done by COW.  Initially, assigning a T[] to a T[new]
could just be a pointer assignment and setting a flag, as long as a copy of the
data is made when you try to increase the length of the T[new] or append to it.
Basically, as long as you use the T[new] as if it were a T[], no copying needs 
to
be done.

6.  Since T[new] guarantees unique access to a memory block, it should have an
assumeUnique() method that returns an immutable slice and sets the T[new]'s
reference to the memory block to null.  This solves the problem of building
immutable arrays without the performance penalty of not being able to 
pre-allocate
or the unsafeness of having to cowboy cast it to immutable.

Uncheck.


Now that I've explained my uniqueness thing better, does this sound feasible?

Nah, there are still problems with Ts that themselves are elaboratetypes containing references, aliasing etc. assumeUnique makes a ratherstrong claim.

7.  As long as the GC is conservative, there absolutely *must* be a method of
manually freeing the memory block referenced by a T[new] provided that the GC
supports this operation, though it doesn't have to be particularly pretty.  In
general, since D is a systems language, T[new] should not be too opaque.  A good
way to do this might be to make all of the fields of the T[new] public but
undocumented.  If you *really* want to mess with it, you'll read the source code
and figure it out.

Check while delete still exists. Please use malloc for that stuff.


As long as I can get at the pointer to the memory block held by T[new] I can 
pass
it to GC.free.  All that would really be needed is a .ptr property that does
something like what .ptr does for T[]s.


Alrighty. Check that.

So now: every place I've said "check" means there was implementation and
book-quality illustrated documentation that Walter and I have done with
the sweat of our brow. At the end we looked at the result and concluded
we should throw away all that work.
Andrei


This begs the question:  Why?  Walter's post on the subject was rather brief 
and I
can't understand for the life of me why you guys would throw away such an 
elegant
solution.


Here's what I wrote to Walter:

====================

I'm going to suggest something terrible - let's get rid of T[new]. Iknow it's difficult to throw away work you've already done, but reallythings with T[new] start to look like a Pyrrhic victory. Here are someissues:

* The abstraction doesn't seem to come off as crisp and clean as we bothwanted;

* There are efficiency issues, such as the two allocations that youvaliantly tried to eliminate in a subset of cases;

* Explaining two very similar but subtly different types to newcomers isexcruciatingly difficult (I'll send you a draft of the chapter - itlooks like a burn victim who didn't make it);

* Furthermore, explaining people when to use one vs. the other is muchmore difficult than it seems. On the surface, it goes like this: "If youneed to append stuff, use T[new]. If not, use T[]." Reality is much moresubtle. For one thing, T[new] does not allow contraction from the left,whereas T[] does. That puts T[] at an advantage. So if you want toappend stuff and also contract from the left, there's nothing ourabstractions can help you with.


Instead of all T[new] stuff, I suggest the following:

1. We stay with T[] and we define a struct ArrayBuilder that replacesT[new] with a much more clear name and charter. Phobos already hasAppender which works very well. We can beef that up to allow array-likeprimitives.


2. Assigning to a slice's .length allocates a new slice if growth is needed.

3. Disallow ~= for slices. ArrayBuilder will define it.

4. That's it.

Java got away with a similar approach using StringBuilder:

http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuilder.html

Scala has something very similar called ArrayBuffer:

http://www.nabble.com/ArrayList-and-ArrayBuffer-td15448842.html

And guess what, C# stole Java's StringBuilder as well:

http://msdn.microsoft.com/en-us/library/2839d5h5%28VS.71%29.aspx

So it looks like many programmers coming from other languages willalready be familiar with the idea that you use a "builder" to grow anarray, and then you use a non-growable array. One thing that Appenderhas and is really cool is that it can grow an already-existing slice. Soyou can grow a slice, play with it for a while, and then grow it againat low cost. I don't think the other languages allow that.

I understand how you must feel about having implemented T[new] and all,but really please please try to detach for a minute and think back. Doeswhat we've got now with T[new] make D a much better place? Between theincrease of the language, the difficulty to explain the minutesubtleties, and the annoying corner cases and oddities, I think it mightbe good to reconsider.

================

Given that we already agree that a T[new] can be implicitly cast to a
T[], the lack of closure under ~, .dup, etc. seems like a non-issue.  When I 
was a
beginner, before I got into templates, a major attraction to D was that it was a
fast, compiled language where basic things like arrays (mostly) just worked.  To
take away all the syntactic sugar for appending and lengthening from arrays and
push this into a separate library type that doesn't feel like a first class 
object
or (I assume) even support most array semantics would be a massive, massive 
kludge
no matter how well-implemented that library type was.

I think the language will be in considerably better shape without T[new]than with it. I understand this is a subjective point. You can argueagainst on virtually every one of my objections.



Andrei

Re: The demise of T[new]

Reply via email to