Re: Unicode handling comparison

Dmitry Olshansky Wed, 27 Nov 2013 12:15:47 -0800

27-Nov-2013 22:12, H. S. Teoh wrote:

On Wed, Nov 27, 2013 at 10:07:43AM -0800, Andrei Alexandrescu wrote:

On 11/27/13 7:43 AM, Jakob Ovrum wrote:

On that note, I tried to use std.uni to write a simple example of how
to correctly handle this in D, but it became apparent that std.uni
should expose something like `byGrapheme` which lazily transforms a
range of code points to a range of graphemes (probably needs a
`byCodePoint` to do the converse too). The two extant grapheme
functions, `decodeGrapheme` and `graphemeStride`, are *awful* for
string manipulation (granted, they are probably perfect for text
rendering).


Yah, byGrapheme would be a great addition.

[...]

+1. This is better than the GraphemeString / i18nString proposal
elsewhere in this thread, because it discourages people from using
graphemes (poor performance) except where actually necessary.


I could have sworn we had byGrapheme somewhere, well apparently not :(
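
For anyone reading this in the archive: later Phobos releases do ship `byGrapheme` and `byCodePoint` in std.uni, so the kind of example Jakob was after can be sketched roughly as below. The helper names `graphemeCount` and `reverseGraphemes` are made up for illustration and are not part of std.uni.

```d
import std.array : array;
import std.conv : text;
import std.range : retro, walkLength;
import std.uni : byCodePoint, byGrapheme;

/// Counts user-perceived characters (grapheme clusters), not code points.
/// (Hypothetical helper, not a std.uni function.)
size_t graphemeCount(string s)
{
    return s.byGrapheme.walkLength;
}

/// Reverses a string grapheme by grapheme, so combining marks stay
/// attached to their base character. (Hypothetical helper.)
string reverseGraphemes(string s)
{
    return s.byGrapheme   // lazy range of Grapheme
            .array        // materialise so it can be walked backwards
            .retro        // reverse the order of the graphemes
            .byCodePoint  // flatten back into a range of dchar
            .text;        // re-encode as a UTF-8 string
}

unittest
{
    string s = "noe\u0308l";          // "noël" written as e + combining diaeresis
    assert(s.walkLength == 5);        // 5 code points
    assert(graphemeCount(s) == 4);    // 4 graphemes
    assert(reverseGraphemes(s) == "le\u0308on");
}
```

Reversing by code point instead would move the diaeresis onto the wrong letter, which is the sort of mistake a lazy grapheme range is meant to make easy to avoid.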

BTW I believe that GraphemeString could still be a valuable addition. I
know of at least one good implementation that gives you O(1) grapheme
access with nice memory footprint numbers. It has many benefits, but the
chief problems with it are:
a) It doesn't solve interchange at all - you'd have to encode on write
and re-code on read.
b) It relies on having global shared state across the whole program, and
that's the real show-stopper.
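
To make both problems concrete, here is a minimal sketch of one way such a scheme could look. The implementation referred to above is not named, so this is not a description of it: the layout is an assumption, and every name (`firstSynthetic`, `clusterOf`, `idOf`, `encode`, `decode`) is hypothetical. Each multi-code-point grapheme is replaced by a synthetic ID outside the Unicode range, which gives one array element per grapheme at the cost of a program-wide table.

```d
import std.conv : to;
import std.uni : byGrapheme;

// Assumption: IDs above the last valid code point (U+10FFFF) are free to use.
enum uint firstSynthetic = 0x11_0000;

// Problem (b): program-wide tables. __gshared makes them truly global in D;
// a real implementation would also need synchronisation, omitted here.
__gshared dstring[] clusterOf;   // synthetic index -> original code points
__gshared uint[dstring] idOf;    // original code points -> synthetic ID

/// Problem (a), reading side: text has to be re-coded on the way in.
uint[] encode(string s)
{
    uint[] units;
    foreach (g; s.byGrapheme)
    {
        if (g.length == 1)
        {
            units ~= cast(uint) g[0];   // single code point, stored as-is
            continue;
        }
        dchar[] buf;
        foreach (i; 0 .. g.length)
            buf ~= g[i];
        dstring key = buf.idup;
        if (auto p = key in idOf)
        {
            units ~= *p;                // cluster already has an ID
        }
        else
        {
            uint id = cast(uint)(firstSynthetic + clusterOf.length);
            clusterOf ~= key;
            idOf[key] = id;
            units ~= id;
        }
    }
    return units;
}

/// Problem (a), writing side: synthetic IDs mean nothing outside this
/// program, so they are expanded back before the text leaves it.
string decode(const uint[] units)
{
    dstring result;
    foreach (u; units)
    {
        if (u < firstSynthetic)
            result ~= cast(dchar) u;
        else
            result ~= clusterOf[u - firstSynthetic];
    }
    return result.to!string;
}

unittest
{
    auto units = encode("noe\u0308l");
    assert(units.length == 4);              // one element per grapheme
    assert(units[2] >= firstSynthetic);     // e + U+0308 got a synthetic ID
    assert(decode(units) == "noe\u0308l");  // round-trips through the table
}
```

Indexing `units[i]` is O(1) per grapheme, but the array is only meaningful together with the global tables, which is where the show-stopper comes from.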


In any case it's a direction well worth exploring.



--
Dmitry Olshansky
