On 17/09/16 13:34, Moritz Lenz wrote:
>> Searching further I found the ucd2c.pl program in the Moarvm tools
>> directory. This generates the unicode_db.c somewhere else in the
>> rakudo tree. I run this program myself on the Unicode 9.0.0
>> database and comparing the generated files shows many differences
>> between the one in the rakudo tree and the generated one.
>
> Please make a rakudo spectest with those changes, and if it passes,
> submit your patch as a pull request.

Unicode support is more than just having the data from the text files in
our own Unicode database. In Unicode 9, the Zero Width Joiner is now
explicitly supported for emoji. If we don't also change the algorithm
that builds individual graphemes from streams of codepoints to take this
into account, we'll end up with improper support for Unicode 8 (because
new stuff is in there) and improper support for Unicode 9 (because some
stuff is missing) at the same time; I suspect that'll help nobody.
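
To illustrate the point, here is a rough Python sketch (not MoarVM's
actual code) of how the same codepoint stream segments differently
depending on whether the grapheme algorithm knows about the emoji ZWJ
rule. The helper names and the codepoint ranges in is_extend and
is_emoji_like are made-up approximations for this example; a real
implementation would use the Grapheme_Cluster_Break property from the
Unicode database instead.

```python
ZWJ = 0x200D  # ZERO WIDTH JOINER


def is_extend(cp):
    # Toy stand-in for "extends the previous grapheme" (assumption):
    # combining diacritical marks and variation selectors only.
    return 0x0300 <= cp <= 0x036F or 0xFE00 <= cp <= 0xFE0F


def is_emoji_like(cp):
    # Toy stand-in for "may follow a ZWJ inside an emoji sequence"
    # (assumption): a couple of broad symbol/pictograph ranges.
    return 0x1F300 <= cp <= 0x1FAFF or 0x2600 <= cp <= 0x27BF


def clusters(text, zwj_aware):
    """Split text into simplified grapheme clusters.

    With zwj_aware=False a ZWJ sticks to what precedes it, but the
    following emoji starts a new cluster. With zwj_aware=True an
    emoji-like codepoint right after a ZWJ stays in the same cluster.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if out:
            prev = ord(out[-1][-1])
            if is_extend(cp) or cp == ZWJ:
                out[-1] += ch
                continue
            if zwj_aware and prev == ZWJ and is_emoji_like(cp):
                out[-1] += ch
                continue
        out.append(ch)
    return out


# MAN, ZWJ, WOMAN, ZWJ, GIRL: intended to render as one family emoji.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
old_style = clusters(family, zwj_aware=False)
new_style = clusters(family, zwj_aware=True)
```

Under these simplified rules, old_style splits the family sequence into
three clusters while new_style keeps it as one, which is the kind of
mismatch you get when the data is updated but the algorithm is not.
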

I expect Jnthn will do the full and proper update during the coming
month. Running ucd2c.pl is the least time-consuming step of that, but
perhaps a pull request for it is still welcome.