On 17/09/16 13:34, Moritz Lenz wrote:
>> Searching further I found the ucd2c.pl program in the Moarvm tools
>> directory. This generates the unicode_db.c somewhere else in the
>> rakudo tree. I run this program myself on the Unicode 9.0.0
>> database and comparing the generated files shows many differences
>> between the one in the rakudo tree and the generated one.
>
> Please make a rakudo spectest with those changes, and if it passes,
> submit your patch as a pull request.

Unicode support is more than just having the data from the text files in our own Unicode database. In Unicode 9, the Zero Width Joiner is now explicitly supported for emoji. If we don't also change the algorithm that builds individual graphemes from streams of codepoints to take this into account, we'll end up with improper support for 8 (because new stuff is in there) and improper support for 9 (because some stuff is missing) at the same time; I suspect that'll help nobody.
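To illustrate why the data alone is not enough, here is a rough Python sketch. It is a toy, not the MoarVM algorithm and not the full Unicode segmentation rules: the `segment` function and its `join_after_zwj` flag are made up for this example, and the real rules only join across a ZWJ for certain emoji rather than for any following character.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER


def segment(text, join_after_zwj):
    """Toy grapheme segmenter (hypothetical, heavily simplified).

    A ZWJ always attaches to the preceding cluster. Only when
    join_after_zwj is true does the character following a ZWJ
    stay in that same cluster as well.
    """
    clusters = []
    for ch in text:
        attach = bool(clusters) and (
            ch == ZWJ or (join_after_zwj and clusters[-1].endswith(ZWJ))
        )
        if attach:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# WOMAN, ZWJ, GIRL: three codepoints meant to display as one emoji.
family = "\U0001F469" + ZWJ + "\U0001F467"

old = segment(family, join_after_zwj=False)  # ZWJ-unaware behaviour
new = segment(family, join_after_zwj=True)   # ZWJ-aware behaviour

print(len(family), len(old), len(new))
```

With the same input and the same character data, the two variants disagree on how many graphemes the string contains, which is the kind of mismatch a data-only update would leave behind.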
I expect Jnthn will do the full & proper update during the coming month, and running ucd2c.pl is the least time-consuming step of that, but perhaps a pull request for this is still welcome.