On Thu, Oct 26, 2017 at 04:47:35PM +0200, Daiki Ueno wrote:
> Daiki Ueno <[email protected]> writes:
>
> > I have been trying to update libunistring to Unicode 9.0.0. Initially I
> > planned it for the end of this month, but now I'm almost giving up,
> > because of the recent additions to the UAX #29 algorithms:
> >
> > - The 3 rules added to the Grapheme Cluster Boundary Rules, namely
> >   (GB10, GB12, GB13), involve 3 consequent characters, while the current
> >   API uc_is_grapheme_break() only takes 2 characters
> >
> > - The similar rules are also added to the Word Boundary Rules. Though
> >   it wouldn't be a problem as uniwbrk.h doesn't expose such API, the
> >   implementation of WB15 and WB16 could be complicated because it
> >   requires lookahead of a next character
>
> As I had some time this week, I resumed this work. Thanks to the help
> of my colleagues, the above new rules involving 3 or more characters are
> now implemented without breaking the ABI.
>
> For the Grapheme Cluster Boundary rules, u*_grapheme_breaks have been
> rewritten to be more generic, taking into account of the entire
> sequence. The other API functions are still kept, but have limitations
> due to the number of arguments.
>
> Bruno, Ben, could you take a look at the attached patch, when you have
> time?

I'm impressed.

I have not looked carefully at the whole patch. That is partly because
of my time constraints, but it is also partly because I get patch
rejects when I apply the patch to the tip of master for gnulib. To what
commit should I apply the patch?

Thanks,

Ben.
