On Fri, 2 Oct 2020 18:10:21 +0200
Richard Ipsum <[email protected]> wrote:

Dear Richard,

> I'm happy to drop this patch from the series but libgrapheme isn't in
> sbase's tree and it doesn't seem reasonable to expect users of sbase
> to install libgrapheme themselves?

this is a good point and I think you should not drop this patch. I see
it like this: sbase has an out-of-tree copy of libutf in its
repository, and in regard to grapheme cluster handling, nothing happens
between the Unicode versions.

One can think about pulling in libgrapheme, porting the
Unicode sections piece by piece and then dropping libutf. Admittedly,
libgrapheme does not have all of the high-level functions, but in many
cases they are trivial to replace or not necessary.

One example is the well-meant is*rune() functions, which for instance
are used in tr(1) to map classes. However, the more I think
about it, they are just insufficient to map grapheme clusters (because
they operate on codepoints only). To give an example, if you have a
grapheme cluster of multiple characters, where one is a digit rune, but
the others aren't, does this mean that the entire grapheme cluster
should be dropped when tr(1) is configured to do so? The same
also applies to upperrune.
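
To make the problem concrete, here is a minimal sketch in plain C. It
uses neither libutf nor libgrapheme: is_digit_cp() is a hypothetical
stand-in for a per-codepoint class test, and the cluster is hard-coded
rather than segmented by a library. The keycap sequence "1" followed by
U+FE0F and U+20E3 is a single grapheme cluster, yet only one of its
three codepoints is in the digit class.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-codepoint class test (ASCII digits only). */
static int
is_digit_cp(uint32_t cp)
{
	return cp >= '0' && cp <= '9';
}

/* Count how many codepoints of one grapheme cluster are in the digit class. */
static size_t
count_digit_cps(const uint32_t *cluster, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		n += is_digit_cp(cluster[i]) ? 1 : 0;

	return n;
}

/*
 * One grapheme cluster made of three codepoints:
 * U+0031 DIGIT ONE, U+FE0F VARIATION SELECTOR-16,
 * U+20E3 COMBINING ENCLOSING KEYCAP
 */
static const uint32_t keycap_one[] = { 0x0031, 0xFE0F, 0x20E3 };
static const size_t keycap_one_len = sizeof(keycap_one) / sizeof(keycap_one[0]);
```

A codepoint-wise "delete digits" would remove U+0031 and leave the two
combining codepoints behind, which is exactly the ambiguity described
above.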

> I'm not at all familiar with libgrapheme either and I don't know what
> the trivial counterexamples Laslo refers to are, maybe it's better if
> he takes over this part of the fix?

No matter what Michael decides to do (i.e. import libgrapheme and port
or stay with libutf), your patch makes sense, because importing
libgrapheme would not mean immediately dropping libutf (and it would be
more of a porting process) and they would stay side-by-side for a while.

With best regards

Laslo
