I've noticed recently that there seems to be a bit of confusion about the fate
of ~[T] now that an implementation of DST (dynamically sized types) is on the
horizon. This has been accompanied by a number of pull requests to completely
remove many uses of ~[T] throughout the standard distribution. I'd like to take
some time to straighten out what's going on with Vec<T> and ~[T].

# Vec<T>

In a post-DST world, Vec<T> will be the "vector builder" type. It will be the
only type for building up a block of contiguous elements. This type exists
today, and lives inside of std::vec. Today, you cannot index Vec<T>, but this
will be enabled in the future once the indexing traits are fleshed out.

This type will otherwise largely not change from what it is today. It will
continue to occupy three words in memory, and continue to have the same runtime
semantics.
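
As a rough sketch of the "three words" claim and of Vec<T> as a builder, the
snippet below assumes a later stable Rust toolchain rather than the compiler
of this email. The helper names `vec_handle_words` and `build_squares` are
made up for illustration.

```rust
use std::mem::size_of;

/// Number of machine words occupied by a `Vec<T>` handle itself
/// (pointer, capacity, length), not counting its heap buffer.
fn vec_handle_words<T>() -> usize {
    size_of::<Vec<T>>() / size_of::<usize>()
}

/// Builds up a contiguous block one element at a time, which is the
/// "vector builder" role described above. Hypothetical helper.
fn build_squares(n: u32) -> Vec<u32> {
    let mut v = Vec::new();
    for i in 0..n {
        v.push(i * i); // growth is possible because the capacity is tracked
    }
    v
}
```

Calling `vec_handle_words::<u8>()` on such a toolchain should report three
words regardless of the element type.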

# ~[T]

The type ~[T] will still exist in a post-DST world, but its representation will
change. Today, a value of type ~[T] is one word (I'll elide the details of this
for now). After DST is implemented, ~[T] will be a two-word value consisting of
a length and a pointer to an array (similar to what slices are today). The ~[T]
type will continue to have move semantics, and you can borrow it to &[T] as
usual.

The major difference between today's ~[T] type and a post-DST ~[T] is that the
push() method will be removed. There is no knowledge of a capacity in the
representation of a ~[T] value, so a push could not be supported at all. In
theory a pop() can be efficiently supported, but it will likely not be
implemented at first.
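
The two-word layout can be illustrated with a sketch in later stable Rust,
under the assumption that `Box<[T]>` is a fair stand-in for a post-DST ~[T].
That mapping, the `OwnedSlice` alias, and the helper names are illustrative
and are not part of this proposal.

```rust
use std::mem::size_of;

/// Assumed analogue of a post-DST `~[T]`: an owned, fixed-length slice.
type OwnedSlice<T> = Box<[T]>;

/// Number of machine words in the handle: a pointer plus a length.
fn owned_slice_words<T>() -> usize {
    size_of::<OwnedSlice<T>>() / size_of::<usize>()
}

/// Takes a borrowed slice, so an owned slice can be lent to it.
fn sum(items: &[u32]) -> u32 {
    items.iter().sum()
}

/// Hypothetical helper producing an owned slice with no capacity field.
fn make_owned() -> OwnedSlice<u32> {
    vec![1, 2, 3].into_boxed_slice()
}
```

A value from `make_owned()` can be moved into a new binding or passed as
`sum(&owned)`, but it has no `push` method, since nothing in the handle
records spare capacity.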

# [T]

As part of DST, the type grammar will start accepting [T] as a possible
substitute for type parameters. This basically means that if a type parameter
T is used as &T, then T can be instantiated with [U], giving &[U].

While possible, I imagine that it will be rare for this to appear in APIs. This
is an unsized type, which means that you can do less with it than you can with
a sized type.

The full details of [T] will become apparent once DST is implemented, but it's
safe to say that APIs and usage should rarely have to deal with this type, and
it will likely be mostly transparent.
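
For a flavor of what substituting [T] for a type parameter looks like, here is
a small sketch in later stable Rust. It assumes the `?Sized` opt-in bound,
which is how later Rust spells this and is not syntax described in this email.
The function name `pointee_bytes` is made up.

```rust
use std::mem::size_of_val;

/// `T: ?Sized` allows `T` to be an unsized type such as `[U]`, so `&T`
/// can be a slice reference `&[U]` as well as a reference to a sized value.
fn pointee_bytes<T: ?Sized>(value: &T) -> usize {
    // The size is read at runtime, since `[U]` has no static size.
    size_of_val(value)
}
```

Passing a `&[u32]` makes `T = [u32]`, while passing a `&u64` makes `T = u64`.
The same function body serves both.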

# Converting between Vec<T> and ~[T]

Conversions between these two types will be provided, and the default
implementations will be free. Converting from Vec<T> to ~[T] will be simply
forgetting the capacity, and converting from ~[T] to Vec<T> will set the
capacity to the length.

Helper methods will likely be provided to perform a forceful reallocating
shrink when going from Vec<T> to ~[T], but it will not be the default.
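
A sketch of the two conversions, again using later stable Rust with
`Box<[T]>` as an assumed stand-in for ~[T]. The names `freeze` and `thaw` are
hypothetical. Note one difference from the default described above: the
standard `Vec::into_boxed_slice` in later Rust drops any excess capacity first,
so it behaves like the "forceful shrink" variant and may reallocate.

```rust
/// Vec<T> => owned slice. In later Rust this discards spare capacity,
/// which may reallocate (unlike the free default proposed in this email).
fn freeze<T>(v: Vec<T>) -> Box<[T]> {
    v.into_boxed_slice()
}

/// Owned slice => Vec<T>. No reallocation is needed: the buffer is reused
/// and the capacity is simply set to the length.
fn thaw<T>(b: Box<[T]>) -> Vec<T> {
    b.into_vec()
}
```

Round-tripping a vector through `freeze` and then `thaw` preserves its
elements and leaves it with a capacity equal to its length.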

## The cost of Vec<T> => ~[T]

Some concerns have been brought up that this can in theory be a costly
transition, under the assumption that it reallocates memory to shrink the
capacity to exactly the length. This will likely not be the default
implementation.

Some concerns have then been brought up that some allocators require the size
of the allocation to be passed to free(), and that this model is incompatible
with that flavor of allocator. We believe that this fear can be
alleviated with a "shrink if necessary" method on allocators. For the default
allocator (backed by the system malloc), this method would be a no-op because
the size passed to free is not used. Allocators which do use the size passed to
free would actually perform a reallocation.
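
To make the "shrink if necessary" idea concrete, here is a toy model in later
stable Rust. Everything in it is hypothetical: the trait, its method names,
and both allocator types are invented for illustration, and it only tracks
capacities as numbers instead of touching real memory.

```rust
/// Hypothetical allocator interface; not a real or proposed Rust API.
trait SketchAllocator {
    /// Whether this allocator's `free` needs the exact allocation size.
    fn free_needs_size(&self) -> bool;

    /// Returns the capacity the buffer has after the call. A real
    /// allocator would reallocate here; this model only does bookkeeping.
    fn shrink_if_necessary(&self, capacity: usize, len: usize) -> usize {
        if self.free_needs_size() && capacity > len {
            len // shrink so the size later passed to free equals the length
        } else {
            capacity // no-op: the size passed to free is ignored
        }
    }
}

/// Models an allocator backed by the system malloc.
struct MallocBacked;
/// Models an allocator whose free() requires the allocation size.
struct SizedFree;

impl SketchAllocator for MallocBacked {
    fn free_needs_size(&self) -> bool { false }
}

impl SketchAllocator for SizedFree {
    fn free_needs_size(&self) -> bool { true }
}
```

With a buffer of capacity 16 and length 3, `MallocBacked` leaves the capacity
alone, while `SizedFree` brings it down to 3 before the capacity is forgotten.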

# Choosing between Vec<T> and ~[T]

Primarily, if you need a growable vector, you should use Vec<T>. If you do not
need a growable vector, but you're instead just dealing with an array of items,
then you should use ~[T].

As a concrete example, I'll take the read_to_end() method on io's Reader trait.
This method must use a Vec<T> internally to accumulate the data as it is read,
but it will return a ~[T] because the contents are conceptually frozen after
they have been read.
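
The pattern of growing internally and returning a frozen result might look
like the sketch below. It assumes later stable Rust, where the trait is
`std::io::Read` and its `read_to_end` fills a caller-supplied `Vec<u8>`, which
differs from the signature discussed here. The wrapper `read_all` is
hypothetical, and `Box<[u8]>` stands in for ~[u8].

```rust
use std::io::{self, Read};

/// Reads everything from `reader`, growing a Vec internally, then returns
/// the bytes as an owned slice that can no longer be pushed to.
fn read_all<R: Read>(mut reader: R) -> io::Result<Box<[u8]>> {
    let mut buf = Vec::new(); // growable while reading
    reader.read_to_end(&mut buf)?;
    Ok(buf.into_boxed_slice()) // frozen once reading is done
}
```

Any `Read` implementor works as input, including a byte slice such as
`&b"hello"[..]`.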

There is no blanket right answer when choosing between Vec<T> and ~[T]. Each
API will need to be evaluated on a case-by-case basis to decide whether it
should take or return a Vec<T> or a ~[T].

# Moving Forward

In order to implement DST, it is not necessary to remove all usage of ~[T]
today. It is necessary to remove all *growable* usage of ~[T], however. Any
code that needs to grow or shrink a vector needs to switch to Vec<T>. If a
vector does not need to be grown or shrunk, it can remain as ~[T].

Concretely speaking, the next steps forward for ~[T] would entail:

* Add a Vec<T> -> ~[T] conversion. This will be an expensive conversion today
  because it requires an allocation (due to the layout of today's ~[T]), but it
  will not be expensive in the future.
* Add a ~[T] -> Vec<T> conversion. Like the above step, this will also be
  expensive today, but it will not be so in the future.
* Remove the `push` and `pop` families of methods from ~[T].


Hopefully that clears up any mystery surrounding what's happening with ~[T] and
Vec<T>! If you have any questions, feel free to respond to this email or to join
us on IRC.
_______________________________________________
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev