On 11-12-23 12:03 PM, Marijn Haverbeke wrote:
> I'm also curious what people think are "the important parts" of unicode.
> Character classification is very important, and should be in core I
> think (if only to encourage people to actually use it instead of
> rolling their own... badly).

Yeah. I looked at ways of doing a minimalist build of libicu today and
it just gets really, really gross. It's also a lot of layers of
indirection for something that ought to be pretty fast (core lexing
routines and such). So I just did a Python-conversion monstrosity into
Rust code. It adds about 80 KB to an optimized libcore and gets us the
general categories and a couple of important derived properties
(XID_Start, XID_Continue, Alphabetic).
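
To make the table approach concrete, here is a minimal sketch of how a generated property table might be queried: a sorted list of inclusive code point ranges plus a binary search. The names `ALPHABETIC_EXCERPT`, `in_ranges` and `is_alphabetic_excerpt` are hypothetical, and the table is a tiny hand-picked excerpt for illustration, not the generated data and not necessarily the layout the real tables use.

```rust
use std::cmp::Ordering;

/// Hypothetical excerpt of an "Alphabetic" range table: inclusive
/// (low, high) code point ranges, sorted by `low` and non-overlapping.
/// A real generated table would have many more entries.
const ALPHABETIC_EXCERPT: &[(char, char)] = &[
    ('A', 'Z'),
    ('a', 'z'),
    ('\u{00C0}', '\u{00D6}'), // Latin-1 letters before the multiplication sign
    ('\u{00D8}', '\u{00F6}'), // Latin-1 letters after it (U+00D7 is excluded)
    ('\u{0391}', '\u{03A1}'), // Greek capital letters Alpha through Rho
];

/// Returns true if `c` falls inside any range of `table`.
/// Requires `table` to be sorted and non-overlapping, so the lookup
/// is a single binary search with no further indirection.
fn in_ranges(c: char, table: &[(char, char)]) -> bool {
    table
        .binary_search_by(|&(lo, hi)| {
            if hi < c {
                Ordering::Less // this range lies entirely below `c`
            } else if lo > c {
                Ordering::Greater // this range lies entirely above `c`
            } else {
                Ordering::Equal // `c` is inside this range
            }
        })
        .is_ok()
}

/// Classification against the excerpt table only (an assumption of this
/// sketch; it does not cover the full Alphabetic property).
fn is_alphabetic_excerpt(c: char) -> bool {
    in_ranges(c, ALPHABETIC_EXCERPT)
}
```

With this excerpt, `is_alphabetic_excerpt('q')` and `is_alphabetic_excerpt('É')` hold, while `'7'` and `'×'` (U+00D7) are rejected. A lexer's identifier check could use the same lookup against XID_Start and XID_Continue tables.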

> Encodings are something people will occasionally need, but a much less
> important thing. This doesn't have to be in core, I think. (And, if I
> understand correctly, much of libicu is encoding tables.)

Agreed. I think it's fine if we keep this stuff in a "full" binding to
libicu outside core. I'll keep updating the char API and munged Unicode
data tables as needed, but this seems semi-workable.
(We'll probably need NFKC and a couple of other bits in core, but
hopefully not *too* much.)
-Graydon
_______________________________________________
Rust-dev mailing list
https://mail.mozilla.org/listinfo/rust-dev