On Fri, 2 Oct 2020 18:10:21 +0200 Richard Ipsum <[email protected]> wrote:
Dear Richard,

> I'm happy to drop this patch from the series but libgrapheme isn't in
> sbase's tree and it doesn't seem reasonable to expect users of sbase
> to install libgrapheme themselves?

this is a good point, and I think you should not drop this patch. I see
it like this: sbase has an out-of-tree copy of libutf in its repository,
and in regard to grapheme cluster handling, nothing happens there
between the Unicode versions. One can think about pulling in
libgrapheme, porting the Unicode-related sections piece by piece and
then dropping libutf.

Admittedly, libgrapheme does not have all of the high-level functions,
but in many cases they are trivial to replace or not necessary. One
example is the well-meant is*rune() functions, which are used in tr(1),
for instance, to map classes. However, the more I think about it, the
more they seem insufficient for mapping grapheme clusters, because they
operate on codepoints only. To give an example: if you have a grapheme
cluster made up of multiple codepoints, where one is a digit rune but
the others aren't, does this mean that the entire grapheme cluster
should be dropped when tr(1) is configured to delete digits? The same
problem also applies to upperrune.

> I'm not at all familiar with libgrapheme either and I don't know what
> the trivial counterexamples Laslo refers to are, maybe it's better if
> he takes over this part of the fix?

No matter what Michael decides to do (i.e. import libgrapheme and port,
or stay with libutf), your patch makes sense: importing libgrapheme
would not mean immediately dropping libutf. It would be more of a
porting process, and the two would stay side by side for a while.

With best regards

Laslo
