On 11-12-23 12:03 PM, Marijn Haverbeke wrote:
> I'm also curious what people think are "the important parts" of unicode.
>
> Character classification is very important, and should be in core I
> think (if only to encourage people to actually use it instead of
> rolling their own... badly).

Yeah. I looked at ways of doing a minimalist build of libicu today, and it just gets really, really gross. It's also a lot of layers of indirection for something that ought to be pretty fast (core lexing routines and such). So I just wrote a Python-conversion monstrosity that turns the Unicode data into Rust code. It adds about 80 KB to an optimized libcore and gets us the general categories plus a couple of important derived properties (XID_Start, XID_Continue, Alphabetic).
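
For readers who want a picture of what a table-driven classifier can look like, here is a minimal sketch. It is not the generated code described above: the table name, the helper names, and the choice of a binary search over sorted ranges are all assumptions made for illustration, the table is only a short excerpt of the Alphabetic property, and the syntax is present-day Rust rather than the Rust of this thread.

```rust
use std::cmp::Ordering;

// Hypothetical excerpt of a generated table: sorted, non-overlapping,
// inclusive code point ranges with the Alphabetic property. A full table
// derived from the Unicode data files would have many more entries.
static ALPHABETIC_SAMPLE: &[(char, char)] = &[
    ('A', 'Z'),
    ('a', 'z'),
    ('\u{aa}', '\u{aa}'),
    ('\u{b5}', '\u{b5}'),
    ('\u{ba}', '\u{ba}'),
    ('\u{c0}', '\u{d6}'),
    ('\u{d8}', '\u{f6}'),
    ('\u{f8}', '\u{2c1}'),
];

// Binary search over the range table. The closure reports how each
// range compares to the character being looked up.
fn in_table(c: char, table: &[(char, char)]) -> bool {
    table
        .binary_search_by(|&(lo, hi)| {
            if c < lo {
                Ordering::Greater // range lies above c
            } else if c > hi {
                Ordering::Less // range lies below c
            } else {
                Ordering::Equal // c falls inside this range
            }
        })
        .is_ok()
}

// Illustrative wrapper; only valid for code points covered by the excerpt.
fn is_alphabetic_sample(c: char) -> bool {
    in_table(c, ALPHABETIC_SAMPLE)
}
```

A lookup like is_alphabetic_sample('é') touches only a static array, which is the kind of low-indirection path a lexer wants. Whether ranges plus binary search is the best layout (versus a trie or bitmap) is a separate question this sketch does not try to answer.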

> Encodings are something people will occasionally need, but a much less
> important thing. This doesn't have to be in core, I think. (And, if I
> understand correctly, much of libicu is encoding tables.)

Agreed. I think it's fine if we keep this stuff in a "full" binding to libicu outside core. I'll keep updating the char API and the munged Unicode data tables as needed, but this seems semi-workable.

(We'll probably need NFKC and a couple other bits in core, but hopefully not *too* much.)

-Graydon
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev
