Re: D's confusing strings (was Re: D on hackernews)

Andrei Alexandrescu Wed, 21 Sep 2011 13:55:44 -0700

On 9/21/11 3:26 PM, Christophe Travert wrote:

Andrei Alexandrescu, in message (digitalmars.D:144936), wrote:

On 9/21/11 1:20 PM, Christophe Travert wrote:

Dealing with UTF-encoded strings is less efficient, but there are a number
of algorithms that can be optimized for UTF-encoded strings, like copying
or finding an ASCII char in a string. Unfortunately, there is no
practical way to do this with the current range API.


I'd love to hear more about that. The standard library does optimize
certain algorithms for UTF strings.



Well, in that other thread called "Re: toUTFz and WinAPI
GetTextExtentPoint32W/" in D.learn (what is the proper way to refer to
a message here?), I showed how to improve walkLength for strings and
utf.stride.
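
For illustration only, here is one common way to count code points in a
UTF-8 string without decoding. It is a sketch under assumptions, not
necessarily the approach shown in the thread mentioned above: the input is
assumed to be valid UTF-8, and the name countCodePoints is hypothetical and
not part of Phobos.

```d
/// Hypothetical helper (not a Phobos function): counts the code points
/// in a string, assuming `s` holds valid UTF-8.
size_t countCodePoints(const(char)[] s)
{
    size_t n = 0;
    // Declaring the loop variable as `char` iterates over raw code units,
    // so no decoding takes place.
    foreach (char unit; s)
    {
        // Continuation bytes have the bit pattern 10xxxxxx. Every other
        // byte starts a new code point, so only those are counted.
        if ((unit & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

void main()
{
    import std.stdio : writeln;
    writeln(countCodePoints("héllo"));
}
```

The loop body has no data-dependent stride, which is what leaves the
compiler free to unroll it.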


Interesting, thanks.

About finding a character in a string: rather than relying
on string.popFront, which makes the loop un-unrollable,
we could search code unit by code unit directly. This is obviously
better for an ASCII char, and I'll be looking for a nice idea for other
code points (besides using find(Range, Range)).
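
A minimal sketch of the code-unit search described above, assuming UTF-8
input and an ASCII needle. The name indexOfAscii is hypothetical and not
part of Phobos. The search is safe because, in UTF-8, bytes below 0x80
never occur inside a multi-byte sequence.

```d
/// Hypothetical helper (not a Phobos function): returns the code unit
/// index of the first occurrence of the ASCII character `c` in `s`,
/// or -1 if it does not occur.
ptrdiff_t indexOfAscii(const(char)[] s, char c)
{
    assert(c < 0x80, "needle must be an ASCII character");
    // Iterating with a `char` loop variable visits raw code units and
    // skips the decoding that front/popFront would perform on a string.
    foreach (i, char unit; s)
    {
        // An ASCII byte cannot be part of a multi-byte sequence,
        // so a raw byte comparison cannot give a false match.
        if (unit == c)
            return cast(ptrdiff_t) i;
    }
    return -1;
}

void main()
{
    import std.stdio : writeln;
    writeln(indexOfAscii("héllo, wörld", ','));
}
```

The returned value is a code unit index, not a code point index, so it
can be used directly to slice the original string.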

I didn't review Phobos with that idea in mind, and didn't do any
benchmark except the one for walkLength, but using string.popFront is a
bad idea in terms of performance, so workarounds are often better, and
they are not that hard to find. I may do that when I have more time to
give to D.


That sounds great. Looking forward to your pull requests!

Andrei
