On Sunday, 7 September 2014 at 10:29:41 UTC, ketmar via Digitalmars-d wrote:
index nth symbol! ucs-4 (aka dchar/dstring) is ok though.

For Western text, UTF-8 is much better because of cache efficiency: most characters fit in a single byte. You can speed up processing using SSE or dedicated data structures.
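To make the indexing cost concrete, here is a minimal sketch (Python, only for illustration) of finding the nth symbol in a UTF-8 buffer. The function name `nth_code_point_offset` is made up for this example. It relies on one fact about the encoding: continuation bytes always have the bit pattern 10xxxxxx, so every other byte starts a code point. The scan is linear, which is the price you pay for the compact representation.

```python
def nth_code_point_offset(data: bytes, n: int) -> int:
    """Return the byte offset where the n-th (0-based) code point starts.

    Hypothetical helper for illustration; assumes `data` is valid UTF-8.
    """
    seen = 0
    for offset, byte in enumerate(data):
        # Bytes of the form 10xxxxxx continue a code point; all others start one.
        if (byte & 0xC0) != 0x80:
            if seen == n:
                return offset
            seen += 1
    raise IndexError("code point index out of range")


# "naïve café" written with escapes so the accented letters are single code points.
text = "na\u00efve caf\u00e9".encode("utf-8")
offset_of_v = nth_code_point_offset(text, 3)  # 'v' comes after the two-byte 'ï'
```

For mostly-ASCII text the loop touches about one byte per symbol, and the continuation-byte test is the kind of check that could be done on many bytes at once with SIMD.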

The point of having unique immutable strings is that they compare by reference only, and that you can have auxiliary data structures that classify them if needed.
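A rough sketch of what I mean by unique immutable strings, again in Python just for illustration. The class `InternTable` and its methods are hypothetical names, not an existing library API. The pool keeps one canonical object per distinct content, so equality becomes an identity check, and a side table keyed by identity plays the role of the auxiliary classifying structure.

```python
class InternTable:
    """Hypothetical intern pool: one canonical immutable object per distinct string."""

    def __init__(self):
        self._pool = {}     # contents -> canonical object
        self._classes = {}  # id(canonical object) -> classification label

    def intern(self, data: bytes) -> bytes:
        # setdefault returns the already stored object when equal contents were seen before.
        return self._pool.setdefault(data, data)

    def classify(self, data: bytes, label: str) -> None:
        # Attach a label to the canonical object for these contents.
        self._classes[id(self.intern(data))] = label

    def label_of(self, canonical: bytes):
        # Only meaningful for objects returned by intern(); returns None otherwise.
        return self._classes.get(id(canonical))


table = InternTable()
a = table.intern(bytes(bytearray(b"hello")))
b = table.intern(bytes(bytearray(b"hello")))
same = a is b  # identity check, no byte-by-byte comparison
table.classify(b"hello", "greeting")
```

The content comparison happens once, at interning time; after that every comparison is a pointer comparison.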

I think the D approach to strings is unpleasant. You should not have slices of strings, only slices of ubyte arrays.

If you want real speedups for streams of symbols you have to move into the landscape of Huffman encoding, tries, dedicated data structures…

Having uniform string support in libraries (i.e. only supporting UTF-8) is a clear advantage IMO; it allows for APIs that are SSE-backed and performant.
