On Fri, 15 Mar 2019 12:52:04 +0100
Hiltjo Posthuma <[email protected]> wrote:

Dear Hiltjo,

> I've applied both of the patches and a small change to the default
> worddelimiters.
> 
> Thanks for the clarifications. The codepoint assumption was indeed
> wrong.
> 
> I do not mind wchar_t, but in practise it is not consistent across
> platforms. However we already use wchar_t in st so it should be as
> correct as possible matching the POSIX standard.
> 
> (@Laslo) for simplicity/sanity sake I think assuming 1 codepoint is 1
> "character" makes sense.

yeah, this should work. As you know, I sometimes get carried away with
these things and the wrong assumption that 1 codepoint is 1 character
is still the common approach across the board with very few exceptions.
It is highly unlikely that one chooses ö as a delimiter character and
then adds it as o + umlaut modifier to the delimiter string.
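
To illustrate the point, here is a minimal sketch in C. The helper name
utf8_codepoints and the two string constants are made up for this
example and are not taken from st; the sketch also assumes valid UTF-8
input. It shows that the same visible character ö can be one codepoint
(precomposed U+00F6) or two (o followed by the combining diaeresis
U+0308), so a comparison per codepoint treats the two spellings as
different:

```c
#include <stddef.h>

/* Two spellings of the same visible character (illustrative constants). */
static const char precomposed[] = "\xc3\xb6";  /* U+00F6 */
static const char decomposed[]  = "o\xcc\x88"; /* U+006F U+0308 */

/*
 * Count the codepoints in a UTF-8 string by counting every byte that
 * is not a continuation byte (10xxxxxx). Assumes valid UTF-8 input.
 */
static size_t
utf8_codepoints(const char *s)
{
	size_t n = 0;

	for (; *s; s++)
		if (((unsigned char)*s & 0xC0) != 0x80)
			n++;
	return n;
}
```

Calling utf8_codepoints(precomposed) yields 1, while
utf8_codepoints(decomposed) yields 2. A delimiter check that works on
single codepoints would therefore only match the decomposed form if the
delimiter string and the input happened to use the same spelling.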

In this context, in my opinion, you made the right call, but in the long
run, if we start a utf+unicode library project, it should be done such
that these matters are reflected properly.

With best regards

Laslo

-- 
Laslo Hunhold <[email protected]>

