Hi Laslo,

On Sun, Oct 02, 2022 at 02:37:12AM +0200, Laslo Hunhold wrote:
> On Sun, 2 Oct 2022 02:01:34 +0200 (CEST)
> [email protected] wrote:
>
> > commit dbc8751dc6c034967d2b3133a58a627834992e8c
> > Author: Jan Klemkow <[email protected]>
> > AuthorDate: Sun Oct 2 00:59:19 2022 +0200
> > Commit: Jan Klemkow <[email protected]>
> > CommitDate: Sun Oct 2 01:00:03 2022 +0200
> >
> > use libgrapheme instead of libutf
>
> thanks for putting forward the trust and using libgrapheme for your
> application!

This task was on my list for some time. I was just too lazy to do it,
till now :)

> I am currently in the process of heavy refactorization in preparation
> of version 2 (I want to put the code on much more formally-verifiable
> footing), but version 1 is generally stable and there are no known
> bugs.

I just ported libgrapheme-1 to OpenBSD. I already have an OK to commit
this after the 7.2 release [1]. Thus, libgrapheme will be available in
OpenBSD 7.3. If you release a newer version, I will update the port.

[1]: https://marc.info/?l=openbsd-ports&m=166409311518680&w=2

> Until now, I refactored the case-, character- and line-functions and
> they are working perfectly, which the unit-tests reflect. The word- and
> sentence-functions have more complex state-handling that requires me to
> think of a fitting data-structure, and I know of some edge-cases where
> they might fail (e.g. NUL-terminated strings) given the iffy
> index-jiggling.
>
> But given you're only using character-break-checks, you're safe. There
> will however be an API-change with version 2 where
>
>     grapheme_next_character_break()
>
> is renamed to
>
>     grapheme_next_character_break_utf8().
>
> I know that such changes are always a bad thing and I gave it a lot of
> thought, but it's better to change now, where only very few projects
> use the library, instead of having to carry this as legacy cruft into
> the future.

I don't worry about an API change. But why do you make the function
names so long? And why do you extend them with "_utf8"? Function names
in C are much shorter in general. For instance, grph_nxt_char_brk()
would be handier to use.

bye,
Jan
