Re: [rust-dev] Reminder: ~[T] is not going away

Huon Wilson Wed, 02 Apr 2014 14:51:55 -0700

Personally, I'm strongly against using ~[] as return values from library functions.

Imagine we were in a world where we only had Vec<T> and were adding a new type OwnedSlice<T> that was (pointer, length) like ~[T]. For how many library functions would we say "it is sensible to throw away the capacity information before returning"? I don't think anything in libstd etc. would have a strong 'yes' answer to this question.

Specifically, I don't see any concrete positives to doing this for library functions other than "let's keep using ~[T]" and ~[T] & &[T] having the same in-memory representation (covered below).


Under any scheme I can think of, there are negatives:

1. without calling shrink_to_fit in the conversion, we lose the ability to have sized deallocations (covered by others in this thread)

2. if we do call it, then anything returning a ~[T] after building it with a Vec<T> is unavoidably slower

3. either way, you're throwing away (the knowledge of) any extra capacity that was allocated, so if someone wishes to continue extending the slice returned by e.g. `foo`, then `let mut v = foo().into_vec(); v.push(1)` will always require a realloc. (And for library functions, we shouldn't be dictating how people use the return values.)

4. it adds two vector-like types that someone needs to think about: in the common case the benefits of ~[] (one word smaller) are completely useless; it only really helps for mostly-immutable, heavily-nested data types with a lot of vectors, like Rust's AST[1]. I.e. almost all situations are fine (or better) with a Vec.

5. how will the built-in ~[] type use allocators? (well, I guess this is really "how will the built-in ~ type use allocators?", but that question still needs answering[2].)
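
Points 2 and 3 can be illustrated in present-day Rust. This is only a sketch under an assumption: `Box<[T]>` is used as a stand-in for the (pointer, length) ~[T] discussed here, and `build` is a hypothetical library function, not anything from libstd.

```rust
// Sketch only: Box<[T]> stands in for the (pointer, length) ~[T].
// `build` is a hypothetical library function.
fn build() -> Box<[i32]> {
    let mut v = Vec::with_capacity(16);
    v.extend([1, 2, 3]);
    // into_boxed_slice discards the excess capacity before converting.
    v.into_boxed_slice()
}

fn main() {
    let mut v = build().into_vec();
    // The capacity reserved by the builder is gone: only `len` slots remain.
    assert_eq!(v.len(), 3);
    assert_eq!(v.capacity(), 3);
    // This push therefore has to grow the buffer.
    v.push(4);
    assert!(v.capacity() >= 4);
    println!("len = {}, capacity = {}", v.len(), v.capacity());
}
```

Had `build` returned the Vec directly, the caller could have pushed up to 13 more elements without touching the allocator.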

On the representation of ~[T] and &[T] being the same: this means that theoretically a ~[T] in covariant(?) position can be coerced to a &[T], e.g. Vec<~[T]> -> Vec<&[T]>. However, this only really matters for functions returning many nested slices/vectors, e.g. the same Vec example, because pretty much anything else will be able to write `vec.as_slice()` cheaply. (In the code base, the only things mentioning /~[~[/ now are a few tests and things handling the raw argc/argv, i.e. returning ~[~[u8]].)

I don't think this should be a major concern, because I don't see us suddenly growing a pile of new functions returning ~[~[T]], and if we do, I would think that they would be better suited to being an iterator (assuming that's possible) over Vecs, and these internal Vecs can then be mapped to ~[T] cheaply before collecting the iterator into a whole new Vec<Vec> (or Vec<~[]>) (assuming a &[Vec]/&[~[]] is wanted).
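
The iterator approach can be sketched in present-day Rust as follows. The assumptions are the same stand-in as before (`Box<[T]>` for ~[T]) plus a hypothetical producer named `rows`; neither comes from the code base under discussion.

```rust
// Sketch only: Box<[T]> stands in for ~[T]; `rows` is a hypothetical producer
// that yields inner vectors one at a time instead of returning ~[~[T]].
fn rows() -> impl Iterator<Item = Vec<u8>> {
    (1..=3u8).map(|n| vec![n; n as usize])
}

fn main() {
    // Map each inner Vec to an owned slice, then collect into a new outer Vec.
    let owned: Vec<Box<[u8]>> = rows().map(|v| v.into_boxed_slice()).collect();
    // Borrowed views come from one pass over the outer Vec.
    let views: Vec<&[u8]> = owned.iter().map(|b| &b[..]).collect();
    assert_eq!(views.len(), 3);
    assert_eq!(views[2], &[3u8, 3, 3][..]);
}
```

The caller decides whether to keep the inner Vecs or freeze them, which is the point about not dictating how return values are used.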

I'm concerned we are wanting to stick with ~[T] because it's what we currently have, and is familiar; as I said above, I don't see many positives for doing it for library functions.





Huon

[1]: And even in those cases, it's not a particularly huge gain, e.g. taking *two* words off the old OptVec type by replacing it with a library equivalent to DST's ~[T] only gained about 40MB: http://huonw.github.io/isrustfastyet/mem/#f5357cf,bbf8cdc

[2]: The sanest way to support allocators I can think of would be changing `~T` to `Uniq<T, A=DefaultAlloc>`, and then we have `Uniq<[T]>` which certainly feels less attractive than `~[T]`.


On 03/04/14 02:35, Alex Crichton wrote:

I've noticed recently that there seems to be a bit of confusion about the fate
of ~[T] with an impending implementation of DST on the horizon. This has been
accompanied with a number of pull requests to completely remove many uses of
~[T] throughout the standard distribution. I'd like to take some time to
straighten out what's going on with Vec<T> and ~[T].

# Vec<T>

In a post-DST world, Vec<T> will be the "vector builder" type. It will be the
only type for building up a block of contiguous elements. This type exists
today, and lives inside of std::vec. Today, you cannot index Vec<T>, but this
will be enabled in the future once the indexing traits are fleshed out.

This type will otherwise largely not change from what it is today. It will
continue to occupy three words in memory, and continue to have the same runtime
semantics.

# ~[T]

The type ~[T] will still exist in a post-DST world, but its representation will
change. Today, a value of type ~[T] is one word (I'll elide the details of this
for now). After DST is implemented, ~[T] will be a two-word value of the length
and a pointer to an array (similarly to what slices are today). The ~[T] type
will continue to have move semantics, and you can borrow it to &[T] as usual.

The major difference between today's ~[T] type and a post-DST ~[T] is that the
push() method will be removed. There is no knowledge of a capacity in the
representation of a ~[T] value, so a push could not be supported at all. In
theory a pop() can be efficiently supported, but it will likely not be
implemented at first.
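
The word counts described above can be checked in present-day Rust. This is a sketch under an assumption: `Box<[T]>` is taken as the closest modern analogue of the post-DST ~[T], and `words` is a hypothetical helper introduced only for this example.

```rust
use std::mem::size_of;

// Hypothetical helper: size of a type measured in machine words.
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn main() {
    // Vec<T>: pointer, capacity, length.
    assert_eq!(words::<Vec<u32>>(), 3);
    // Box<[T]> (assumed stand-in for post-DST ~[T]): pointer, length.
    assert_eq!(words::<Box<[u32]>>(), 2);
    // A borrowed slice has the same two-word shape.
    assert_eq!(words::<&[u32]>(), 2);

    let frozen: Box<[u32]> = vec![1, 2, 3].into_boxed_slice();
    // Borrowing to a slice works as usual.
    let view: &[u32] = &frozen;
    assert_eq!(view.len(), 3);
    // There is no push() on the owned slice: growing means going back to a Vec.
    let mut grown = frozen.into_vec();
    grown.push(4);
    assert_eq!(grown, [1, 2, 3, 4]);
}
```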

# [T]

As part of DST, the type grammar will start accepting [T] as a possible
substitute for type parameters. This basically means that if your type
parameter is used as &T, then &[U] can satisfy it, with T = [U].

While possible, I imagine that it will be rare for this to appear in apis. This
is an unsized type, which means that it's more limited what you can do with it
than you can with a sized type.

The full details of [T] will become apparent once DST is implemented, but it's
safe to say that APIs and usage should rarely have to deal with this type, and
it will likely be mostly transparent.
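
As a sketch of what "[T] as a type parameter" looks like, here is present-day Rust, where the opt-in is spelled `T: ?Sized`. That spelling is today's syntax and is an assumption relative to this message; `byte_len` is a hypothetical helper.

```rust
use std::mem::size_of_val;

// `T: ?Sized` allows T to be an unsized type such as [U].
// `byte_len` is a hypothetical helper for illustration.
fn byte_len<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}

fn main() {
    let data = [1u16, 2, 3];
    // Here T = [u16], so &T is the slice type &[u16].
    assert_eq!(byte_len(&data[..]), 6);
    // A sized T still works with the same function.
    assert_eq!(byte_len(&7u32), 4);
}
```

The unsized value can only be used behind a pointer, which is the limitation mentioned above.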

# Converting between Vec<T> and ~[T]

Conversions between these two types will be provided, and the default
implementations will be free. Converting from Vec<T> to ~[T] will be simply
forgetting the capacity, and converting from ~[T] to Vec<T> will set the
capacity to the length.

Helper methods will likely be provided to perform a forceful reallocating
shrink when going from Vec<T> to ~[T], but it will not be the default.

## The cost of Vec<T> => ~[T]

Some concerns have been brought up that this can in theory be a costly
transition under the assumption that this does a reallocation of memory to
shrink the capacity to exactly the length. This will likely not be the
default implementation.

Some concerns have then been brought up that some allocators require the size
of the allocation to be passed to free(), and that this model is incompatible
with that flavor of allocator. We believe that this fear can be
alleviated with a "shrink if necessary" method on allocators. The default
allocator (backed by the system malloc) would be a no-op because the size to
free is not used. Allocators which use the size passed to free would actually
perform a reallocation.
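
The "shrink if necessary" idea can be modeled roughly as below. Everything in this sketch is hypothetical: the trait `AllocModel`, the types `SizeAgnostic` and `SizedFree`, and the helper `spare_after` are invented for illustration and do not correspond to any real allocator API; `Vec::shrink_to_fit` stands in for the reallocation.

```rust
// Hypothetical model: neither the trait nor the types are a real Rust API.
trait AllocModel {
    /// Called when a Vec is converted to a (pointer, length) owned slice.
    fn shrink_if_necessary<T>(&self, v: &mut Vec<T>);
}

/// Models a malloc-style allocator whose free() ignores the size.
struct SizeAgnostic;
/// Models an allocator whose free() must be given the allocation size.
struct SizedFree;

impl AllocModel for SizeAgnostic {
    fn shrink_if_necessary<T>(&self, _v: &mut Vec<T>) {
        // No-op: the excess capacity can simply be forgotten.
    }
}

impl AllocModel for SizedFree {
    fn shrink_if_necessary<T>(&self, v: &mut Vec<T>) {
        // Reallocate so the length alone describes the allocation.
        v.shrink_to_fit();
    }
}

// Returns how much unused capacity is left after the conversion hook runs.
fn spare_after<A: AllocModel>(alloc: &A) -> usize {
    let mut v: Vec<u8> = Vec::with_capacity(64);
    v.extend([1, 2, 3]);
    alloc.shrink_if_necessary(&mut v);
    v.capacity() - v.len()
}

fn main() {
    // The size-agnostic model keeps the spare capacity untouched.
    assert!(spare_after(&SizeAgnostic) >= 61);
    // The sized-free model pays for a reallocation to drop it.
    assert!(spare_after(&SizedFree) < spare_after(&SizeAgnostic));
}
```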

# Choosing between Vec<T> and ~[T]

Primarily, if you need a growable vector, you should use Vec<T>. If you do not
need a growable vector, but you're instead just dealing with an array of items,
then you should use ~[T].

As a concrete example, I'll take the read_to_end() method on io's Reader trait.
This method must use a Vec<T> internally to read data into the vector, but it will
return a ~[T] because the contents are conceptually frozen after they have been
read.
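
A sketch of that shape in present-day Rust, under stated assumptions: `Box<[u8]>` stands in for ~[u8], and `read_frozen` is a hypothetical wrapper (the real `std::io::Read::read_to_end` appends into a caller-supplied Vec instead).

```rust
use std::io::{self, Read};

// `read_frozen` is a hypothetical wrapper, not a std API.
fn read_frozen<R: Read>(mut reader: R) -> io::Result<Box<[u8]>> {
    // Growable buffer while the data is still arriving.
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf)?;
    // Freeze the result; any spare capacity is dropped at this point.
    Ok(buf.into_boxed_slice())
}

fn main() -> io::Result<()> {
    // A byte slice implements Read, so it can serve as the input here.
    let bytes = read_frozen(&b"hello"[..])?;
    assert_eq!(&bytes[..], &b"hello"[..]);
    Ok(())
}
```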

There is no blanket right decision when choosing between Vec<T> and ~[T]; this will
need to be done on a case-by-case basis to evaluate whether APIs should take or
consume Vec<T> or ~[T].

# Moving Forward

In order to implement DST, it is not necessary to remove all usage of ~[T]
today. It is necessary to remove all *growable* usage of ~[T], however. All uses
of vectors which need growable or shrinkable vectors need to switch to Vec<T>.
If a vector does not need to be grown or shrunk, it can remain as ~[T].

Concretely speaking, the next steps forward for ~[T] would entail:

* Add a Vec<T> -> ~[T] conversion. This will be an expensive conversion today
   because it requires an allocation (due to the layout of today's ~[T]), but it
   will not be expensive in the future.
* Add a ~[T] -> Vec<T> conversion. Like the above step, this will also be
   expensive today, but it will not be so in the future.
* Remove the `push` and `pop` families of methods from ~[T]


Hopefully that clears up any mystery surrounding what's happening with ~[T] and
Vec<T>! If you have any questions, feel free to respond to this email or to join
us in IRC.
_______________________________________________
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev

