24-Aug-2014 21:59, Andrei Alexandrescu wrote:
On 8/24/14, 6:16 AM, Dmitry Olshansky wrote:
24-Aug-2014 16:24, Andrei Alexandrescu wrote:
Speaking of data structures, I find just about the opposite. Most data
structures are small, which must be the fact so fondly exploited by C++
vector: small-string optimization. Only very few data structures are
large in a given program, and they usually correspond to some global
tables and repositories. The others are either short-lived byproducts of
input processing or small data sets attached to some global entity.
I don't know of any std::vector that uses the small string optimization.
This time it's me who must be wrong.
Yet I see that this is a recognized need:
https://github.com/facebook/folly/blob/master/folly/small_vector.h
LLVM folks seem to do the same.
With that in mind small containers might be better as a special value type.
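
To make the "small container as a value type" idea concrete, here is a minimal sketch in C++ of a vector that keeps up to N elements inline and only allocates once it outgrows that. The name SmallIntVector and all of its members are made up for illustration; this is not how folly::small_vector or LLVM's equivalent is actually implemented, and it handles only int to stay short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical sketch, not folly::small_vector or llvm::SmallVector.
template <std::size_t N>
class SmallIntVector {
    static_assert(N > 0, "inline capacity must be positive");

    int inline_[N];                // used while the size fits in N
    std::unique_ptr<int[]> heap_;  // used once the vector outgrows N
    std::size_t size_ = 0;
    std::size_t cap_ = N;

    int* data() { return heap_ ? heap_.get() : inline_; }
    const int* data() const { return heap_ ? heap_.get() : inline_; }

public:
    SmallIntVector() = default;

    // Copying copies the elements: the container behaves as a value.
    SmallIntVector(const SmallIntVector& o) : size_(o.size_), cap_(o.cap_) {
        if (o.heap_) heap_.reset(new int[cap_]);
        std::copy(o.data(), o.data() + size_, data());
    }
    // Assignment is left out to keep the sketch short.
    SmallIntVector& operator=(const SmallIntVector&) = delete;

    void push_back(int x) {
        if (size_ == cap_) {
            // Out of room: move everything to a heap buffer twice as large.
            std::size_t bigger_cap = cap_ * 2;
            std::unique_ptr<int[]> bigger(new int[bigger_cap]);
            std::copy(data(), data() + size_, bigger.get());
            heap_ = std::move(bigger);
            cap_ = bigger_cap;
        }
        data()[size_++] = x;
    }

    int& operator[](std::size_t i) {
        assert(i < size_);
        return data()[i];
    }

    std::size_t size() const { return size_; }
    bool on_heap() const { return heap_ != nullptr; }
};
```

With N = 4, the first four push_back calls touch no allocator at all; only the fifth one moves the elements to the heap.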
std::string does so ubiquitously because (a) strings are often handled as
values, and (b) C++11 made refcounted strings illegal (a forced
mistake), thereby robbing implementers of an important optimization.
In a way both C++ and D got it "wrong". Arrays/containers are entity
types - they have identity and should be manipulated most often by
reference. Presence of pass-by-value of containers in C++ programs, save
for rvalue optimization purposes, is suspicious.
Agreed.
In contrast, strings
are value types - they are handled most often as a unit and passed by
value, just like e.g. numbers.
Indeed, just note that it would be really nice not to actually copy
strings. In fact it seems that copy-on-write (with ref-counting) for big
strings and small-string optimization for small ones is an almost ideal
solution. In D we could have non-atomic (thread-local) copy-on-write,
which should be quite fast.
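
A rough sketch of that combination, written in C++ only for illustration: short strings live in an inline buffer and are copied byte by byte, long strings share one heap buffer through a plain (non-atomic) counter and are duplicated only on the first write. CowString, kInline, set and shares_buffer_with are hypothetical names, not any existing library's API, and the plain counter assumes the string never crosses threads, which is what thread-local data in D would guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical sketch; assumes single-threaded (thread-local) use.
class CowString {
    static constexpr std::size_t kInline = 15;  // longest inline string

    struct Shared {
        std::size_t refs;  // plain counter, deliberately not atomic
        char* data;
    };

    std::size_t len_ = 0;
    Shared* big_ = nullptr;          // non-null only for long strings
    char small_[kInline + 1] = {};   // inline storage for short strings

    void release() {
        if (big_ && --big_->refs == 0) {
            delete[] big_->data;
            delete big_;
        }
        big_ = nullptr;
    }

public:
    CowString() = default;

    explicit CowString(const char* s) : len_(std::strlen(s)) {
        if (len_ <= kInline) {
            std::memcpy(small_, s, len_ + 1);
        } else {
            big_ = new Shared{1, new char[len_ + 1]};
            std::memcpy(big_->data, s, len_ + 1);
        }
    }

    CowString(const CowString& o) : len_(o.len_), big_(o.big_) {
        if (big_) ++big_->refs;                              // long: share
        else std::memcpy(small_, o.small_, sizeof small_);   // short: copy
    }

    CowString& operator=(CowString o) { swap(o); return *this; }
    ~CowString() { release(); }

    void swap(CowString& o) {
        std::swap(len_, o.len_);
        std::swap(big_, o.big_);
        std::swap(small_, o.small_);
    }

    const char* c_str() const { return big_ ? big_->data : small_; }
    std::size_t size() const { return len_; }
    bool shares_buffer_with(const CowString& o) const {
        return big_ != nullptr && big_ == o.big_;
    }

    // Write access: detach from a shared buffer before modifying it.
    void set(std::size_t i, char c) {
        assert(i < len_);
        if (big_ && big_->refs > 1) {
            Shared* mine = new Shared{1, new char[len_ + 1]};
            std::memcpy(mine->data, big_->data, len_ + 1);
            --big_->refs;
            big_ = mine;
        }
        char* buf = big_ ? big_->data : small_;
        buf[i] = c;
    }
};
```

Copying a long CowString costs one pointer copy and one increment; the character data is duplicated only if one of the copies is later written to through set.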
C++ made both containers and strings value types, so it needs forever to
look over its shoulder about n00bs copying large containers unwittingly.
It also does a fair amount of unneeded string copying, and optimizing
string-based C++ code is nontrivial. D made both arrays and strings
slices, a data structure made highly expressive by the garbage collector
but that occasionally confuses people. With std.refcounted.RCString and
std.container.Array we get both abstractions "right".
Good points.
--
Dmitry Olshansky