Re: Today's programming challenge - How's your Range-Fu ?

Panke via Digitalmars-d Sat, 18 Apr 2015 01:30:33 -0700

On Saturday, 18 April 2015 at 08:18:46 UTC, Walter Bright wrote:

On 4/18/2015 12:58 AM, John Colvin wrote:
On Friday, 17 April 2015 at 18:41:59 UTC, Walter Bright wrote:
On 4/17/2015 9:59 AM, H. S. Teoh via Digitalmars-d wrote:
So either you have to throw out all pretenses of Unicode-correctness and just stick with ASCII-style per-character line-wrapping, or you have to live with byGrapheme with all the complexity that it entails. The former is quite easy to write -- I could throw it together in a couple o' hours max, but the latter is a pretty big project (cf. Unicode line-breaking
algorithm, which is one of the TR's).
It'd be good enough to duplicate the existing behavior, which is to treat
decoded unicode characters as one column.
Code points aren't equivalent to characters. They're not the same thing in most
European languages,
I know a bit of German, for what characters is that not true?

Umlauts, if combined characters are used. Also words that still have their accents left after import from foreign languages. E.g. Café
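
To make the Café example concrete, here is a minimal sketch in Python (chosen only for illustration, not D; the variable names are made up) comparing the precomposed spelling with the one that uses a combining accent:

```python
import unicodedata

precomposed = "Caf\u00e9"   # 'é' as the single code point U+00E9
combining = "Cafe\u0301"    # 'e' followed by U+0301 COMBINING ACUTE ACCENT

# Both render as "Café", but the code point counts differ.
print(len(precomposed))          # 4
print(len(combining))            # 5
print(precomposed == combining)  # False

# A non-zero canonical combining class marks a combining character.
print([unicodedata.combining(ch) for ch in combining])
```

So a wrapper that counts one column per decoded code point would see five columns for the second spelling, although a reader sees four characters.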

Getting all unicode correct seems a daunting task with a severe performance impact, esp. if we need to assume that a string might have any normalization form or none at all.


See also: http://unicode.org/reports/tr15/#Norm_Forms
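
A rough sketch of what "assume any normalization form" costs in practice, again in Python rather than D. The helper name `columns` is hypothetical, and this is only an approximation: it normalizes to NFC and skips any combining marks that are left over, but it does not implement grapheme segmentation and ignores wide and zero-width characters.

```python
import unicodedata

def columns(text):
    """Approximate column count: NFC-normalize, then skip combining marks.

    Not a grapheme-cluster count; an illustrative shortcut only.
    """
    nfc = unicodedata.normalize("NFC", text)
    return sum(1 for ch in nfc if unicodedata.combining(ch) == 0)

# Both spellings of "Café" give the same count once normalized.
print(columns("Caf\u00e9"))   # 4
print(columns("Cafe\u0301"))  # 4

# 'q' + combining acute has no precomposed form, so NFC keeps two
# code points; the combining-class check still counts one column.
print(columns("q\u0301"))     # 1
```

The extra normalization pass over every string is exactly the kind of overhead meant above.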
