Re: Kinds of containers

Timon Gehr via Digitalmars-d Wed, 21 Oct 2015 07:27:08 -0700

On 10/21/2015 01:05 PM, Andrei Alexandrescu wrote:

I'm finally getting the cycles to get to thinking about Design by
Introspection containers. First off, there are several general
categories of containers. I think D should support all properly. One
question is which we go for in the standard library.


1. Functional containers.

These are immutable; once created, neither their topology nor their
elements may be observably changed. Manipulating a container entails
creating an entire new container, often based on an existing container
(e.g. append takes a container and an element and creates a whole new
container).

Internally, functional containers take advantage of common substructure
and immutability to share actual data. The classic resource for defining
and implementing functional containers is
http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.

...

I still think those should be mutable by default in order to have
painless interchangeability with other value type containers. Why should
corresponding ephemeral and persistent containers have different
interfaces?


I assume you envision code using those to look as follows?

FunSet!int a;
a=a.with_(1);
auto b=a;
a=a.with_(2);
a=a.with_(3);
// b = {1}, a={1,2,3};

I think this should be allowed too:

FunSet!int a;
a.insert(1);
auto b=a;
a.insert(2);
a.insert(3);
// b = {1}, a={1,2,3};

One of the two versions should be automatically implemented via UFCS.
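
A minimal sketch of what that could look like, assuming a toy persistent
set backed by a shared linked list. The names FunSet, with_ and insert
come from the snippets above; Node, contains and length are made up for
illustration, and nothing here is a proposed library design. Only with_
is a member; insert is a free function built on top of it and picked up
through UFCS.

```d
// Hypothetical persistent set; the real FunSet design is not specified here.
struct FunSet(T)
{
    // Nodes are never modified after construction, so sets can share them.
    private static class Node
    {
        T value;
        Node next;
        this(T value, Node next) { this.value = value; this.next = next; }
    }

    private Node head;

    bool contains(T x)
    {
        for (Node n = head; n !is null; n = n.next)
            if (n.value == x) return true;
        return false;
    }

    size_t length()
    {
        size_t count = 0;
        for (Node n = head; n !is null; n = n.next) ++count;
        return count;
    }

    // Persistent interface: returns a new set and leaves this one untouched.
    FunSet with_(T x)
    {
        if (contains(x)) return this;
        FunSet result;
        result.head = new Node(x, head); // shares the old nodes as the tail
        return result;
    }
}

// Mutable-looking interface derived from with_, reachable via UFCS.
void insert(T)(ref FunSet!T s, T x)
{
    s = s.with_(x);
}

unittest
{
    FunSet!int a;
    a.insert(1);
    auto b = a;
    a.insert(2);
    a.insert(3);
    assert(b.length == 1 && b.contains(1));
    assert(a.length == 3 && a.contains(2) && a.contains(3));
}
```

The unittest block can be run with something like rdmd -main -unittest.
The linear-time contains is only there to keep the sketch short.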

2. Reference containers.

These have classic reference semantics (à la Java). Internally, they may be 
implemented either as class objects or as reference counted structs.

They're by default mutable. Qualifiers should apply to them gracefully.

3. Eager value containers.

These are STL-style. Somewhat surprisingly I think these are the worst of the 
pack; they expensively duplicate at the drop of a hat and need to be carefully 
passed around by reference lest performance silently drops. Nevertheless, when 
used as members inside other data structures value semantics might be the 
appropriate choice. Also, thinking of them as values often makes code simpler.

By default eager value containers are mutable. They should support immutable 
and const meaningfully.


4. Copy-on-write containers.

These combine the advantages of value and reference containers: you get
to think of them as values, yet they're not expensive to copy. Copying
only occurs by necessity upon the first attempt to change them.
...

IMO "1." ought to combine the advantages of value and reference
containers as well, just without any expensive copying at all, even when
updates happen.
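
For comparison, here is a rough sketch of the copy-on-write behaviour
described above, assuming a simple reference count kept next to the
data. CowArray, Payload, refs and sharesStorageWith are hypothetical
names, and the sketch ignores thread safety and the language support
that proper COW would need. Copies only bump the count; the first write
through a shared handle pays for a full duplicate, which is exactly the
cost a persistent container would avoid.

```d
// Hypothetical copy-on-write array; a sketch, not a library design.
struct CowArray(T)
{
    private static struct Payload
    {
        T[] data;
        size_t refs; // number of CowArray handles sharing this payload
    }

    private Payload* p;

    this(T[] items)
    {
        p = new Payload(items.dup, 1);
    }

    // Copying a CowArray copies no elements, it only bumps the count.
    this(this)
    {
        if (p) ++p.refs;
    }

    ~this()
    {
        if (p) --p.refs;
    }

    T opIndex(size_t i) { return p.data[i]; }

    void opIndexAssign(T value, size_t i)
    {
        if (p.refs > 1)
        {
            // Shared payload: detach with a private copy before writing.
            --p.refs;
            p = new Payload(p.data.dup, 1);
        }
        p.data[i] = value;
    }

    bool sharesStorageWith(ref CowArray other) { return p is other.p; }
}

unittest
{
    auto a = CowArray!int([1, 2, 3]);
    auto b = a;                      // cheap: both handles share one payload
    assert(a.sharesStorageWith(b));
    b[0] = 42;                       // first write triggers the actual copy
    assert(!a.sharesStorageWith(b));
    assert(a[0] == 1 && b[0] == 42);
}
```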

The disadvantage is implementations get somewhat complicated. Also, they
are shunned in C++ because there is no proper support for COW; for
example, COW strings have been banned starting with C++11 which is quite
the bummer.

Together with Scott Meyers, Walter figured out a way to change D to
support COW properly. The language change consists of two attributes.

=======

I'll attempt to implement a few versions of each and see what they look
like. The question here is what containers are of interest for D's
standard library.
...


List:
- forward iteration
- bidirectional iteration

Stack:
- basic stack
- ordered stack [0]

Queue:
- basic queue
- heap

Set:
- hash set
- ordered set
- accumulating set [1]
- trie/radix tree
+ multiset versions


Map:
- hash map
- ordered map
- accumulating map [1]
- accumulating map with range update [2]
- trie/radix tree
- accumulating trie [1]
- accumulating trie with range update [2]
+ multimap versions
- array
- accumulating array [1]
- accumulating array with range update [2]
+ O(1) reset versions [3]
- rope
- accumulating rope [1]
- accumulating rope with range update [2]

[0] ordered stack: Push operations automatically pop the minimal number
of elements off the stack prior to pushing, so as to guarantee that the
elements on the stack remain ordered. The stack should expose a sorted
range in order to support binary search.
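
A small sketch of [0], assuming an array-backed stack kept in
non-decreasing order from bottom to top. OrderedStack, push and sorted
are hypothetical names; assumeSorted and SortedRange.contains are the
existing std.range facilities.

```d
import std.algorithm.comparison : equal;
import std.range : assumeSorted;

// Hypothetical ordered stack; a sketch under the assumptions stated above.
struct OrderedStack(T)
{
    private T[] items;

    // Pops the minimal number of elements so that x can go on top
    // without breaking the ordering, then pushes x.
    void push(T x)
    {
        while (items.length > 0 && items[$ - 1] > x)
            items = items[0 .. $ - 1];
        items ~= x;
    }

    size_t length() { return items.length; }

    // Exposes the contents as a SortedRange, which supports binary search.
    auto sorted() { return items.assumeSorted; }
}

unittest
{
    OrderedStack!int s;
    s.push(1);
    s.push(5);
    s.push(7);
    s.push(3);                       // pops 7 and 5 first
    s.push(4);
    assert(s.sorted.equal([1, 3, 4]));
    assert(s.sorted.contains(3));    // binary search
    assert(!s.sorted.contains(5));
}
```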

[1] accumulation: The (ordered!) data structure allows fast queries for
the result of some binary associative operations on the elements in a
certain range (the allowed operations are determined in advance and
some intermediate results are automatically maintained). For a map, the
operation would be just on the values, not on the keys. This is usually
quite easy to support, but very useful.
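
As an illustration of [1], here is a sketch of an accumulating array,
assuming a bottom-up segment tree as the way to maintain the
intermediate results. AccArray, query and the template parameters op
and identity are hypothetical names; other structures (e.g. balanced
trees with per-node summaries) would do the same job for sets and maps.

```d
// Hypothetical accumulating array: op must be associative and
// identity must be its neutral element.
struct AccArray(T, alias op, T identity)
{
    private T[] tree; // tree[n .. 2n] are the elements, tree[1 .. n] partial results
    private size_t n;

    this(const T[] values)
    {
        n = values.length;
        tree = new T[2 * n];
        tree[] = identity;
        tree[n .. $] = values[];
        foreach_reverse (i; 1 .. n)
            tree[i] = op(tree[2 * i], tree[2 * i + 1]);
    }

    T opIndex(size_t i) { return tree[i + n]; }

    // Point update: refreshes the O(log n) partial results above slot i.
    void opIndexAssign(T value, size_t i)
    {
        i += n;
        tree[i] = value;
        for (i /= 2; i >= 1; i /= 2)
            tree[i] = op(tree[2 * i], tree[2 * i + 1]);
    }

    // Result of op over the elements in the half-open range [lo, hi).
    T query(size_t lo, size_t hi)
    {
        T left = identity, right = identity;
        lo += n;
        hi += n;
        for (; lo < hi; lo /= 2, hi /= 2)
        {
            if (lo & 1) left = op(left, tree[lo++]);
            if (hi & 1) right = op(tree[--hi], right);
        }
        return op(left, right);
    }
}

unittest
{
    alias SumArray = AccArray!(int, (a, b) => a + b, 0);
    auto a = SumArray([3, 1, 4, 1, 5]);
    assert(a.query(1, 4) == 6);      // 1 + 4 + 1
    a[2] = 10;
    assert(a.query(1, 4) == 12);     // 1 + 10 + 1
    assert(a.query(0, 5) == 20);
}
```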

[2] range update: Here, the idea is that the data structure allows all
elements in a certain range to be updated. The updates are performed
lazily and have to be compatible with the associative operations (if
any).
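
One possible shape for [2], assuming the accumulated operation is a sum
and the range update is "add delta to every element". LazySumTree and
all member names are hypothetical. Updates that cover a whole subtree
are recorded in pending and only pushed down when a later operation
needs to look inside that subtree.

```d
// Hypothetical array of n zeros with lazy range-add and range-sum (n >= 1).
struct LazySumTree
{
    private long[] sum;     // sum of each node's segment
    private long[] pending; // per-element increment not yet pushed to children
    private size_t n;

    this(size_t n)
    {
        this.n = n;
        sum = new long[4 * n];
        pending = new long[4 * n];
    }

    // Adds delta to every element in [lo, hi).
    void add(size_t lo, size_t hi, long delta) { update(1, 0, n, lo, hi, delta); }

    // Sum of the elements in [lo, hi).
    long query(size_t lo, size_t hi) { return get(1, 0, n, lo, hi); }

    private void apply(size_t node, size_t len, long delta)
    {
        sum[node] += delta * cast(long) len;
        pending[node] += delta;
    }

    private void push(size_t node, size_t l, size_t r)
    {
        if (pending[node] == 0) return;
        immutable m = (l + r) / 2;
        apply(2 * node, m - l, pending[node]);
        apply(2 * node + 1, r - m, pending[node]);
        pending[node] = 0;
    }

    private void update(size_t node, size_t l, size_t r,
                        size_t lo, size_t hi, long delta)
    {
        if (hi <= l || r <= lo) return;                  // disjoint
        if (lo <= l && r <= hi) { apply(node, r - l, delta); return; }
        push(node, l, r);
        immutable m = (l + r) / 2;
        update(2 * node, l, m, lo, hi, delta);
        update(2 * node + 1, m, r, lo, hi, delta);
        sum[node] = sum[2 * node] + sum[2 * node + 1];
    }

    private long get(size_t node, size_t l, size_t r, size_t lo, size_t hi)
    {
        if (hi <= l || r <= lo) return 0;                // disjoint
        if (lo <= l && r <= hi) return sum[node];
        push(node, l, r);
        immutable m = (l + r) / 2;
        return get(2 * node, l, m, lo, hi) + get(2 * node + 1, m, r, lo, hi);
    }
}

unittest
{
    auto t = LazySumTree(8);
    t.add(2, 6, 3);                  // elements 2..5 become 3
    assert(t.query(0, 8) == 12);
    t.add(0, 5, 1);                  // elements 0..4 get +1
    assert(t.query(4, 5) == 4);
    assert(t.query(0, 8) == 17);
}
```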

[3] fast reset: Here the idea is that the map allows fast reset of its
values at the cost of some small additional overhead per lookup.
(Destructors are called lazily.)
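
A sketch of one way to get [3] for arrays, assuming a per-slot stamp
compared against a global epoch counter. ResettableArray, stamp, epoch
and reset are hypothetical names. The extra comparison in opIndex is the
per-lookup overhead; stale values stay in place until they are
overwritten, which is where lazily run destructors would come in.

```d
// Hypothetical fixed-size array with O(1) reset.
struct ResettableArray(T)
{
    private T[] values;
    private size_t[] stamp;   // epoch in which each slot was last written
    private size_t epoch = 1; // stamps start at 0, so every slot begins stale
    private T defaultValue;

    this(size_t n, T defaultValue = T.init)
    {
        values = new T[n];
        stamp = new size_t[n];
        this.defaultValue = defaultValue;
    }

    // A slot only counts if it was written during the current epoch.
    T opIndex(size_t i)
    {
        return stamp[i] == epoch ? values[i] : defaultValue;
    }

    void opIndexAssign(T value, size_t i)
    {
        values[i] = value;
        stamp[i] = epoch;
    }

    // O(1): bumping the epoch invalidates every slot at once.
    void reset() { ++epoch; }
}

unittest
{
    auto a = ResettableArray!int(4);
    a[1] = 7;
    assert(a[1] == 7);
    a.reset();
    assert(a[1] == 0);
    a[2] = 5;
    assert(a[2] == 5 && a[1] == 0);
}
```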
