Tim Rühsen <tim.rueh...@gmx.de> writes:

> On Tuesday, 27 December 2016 12:26:44 CET Simon Josefsson wrote:
>> Now that Tim implemented TR46 and there is a release with it out, I'm
>> pondering some next steps for libidn2, which may include:
>>
>> * Better APIs to simplify conversion for applications
>>   - Compare libidn APIs that take various string forms.
>>   - In particular, the API should take entire domain names
>>     instead of only labels.
>>   - Also in particular, there should be an API for decoding.
>> * Other language bindings?  Compare libidn.
>>
>> More ideas are welcome!
>>
>> Continuous integration on gitlab would be nice.  It was some time since I
>> played with it last time... things have likely changed.
>
> * Reducing static table size (tr46map.*)
>   - using a trie for idna_map would allow compact storing of codepoints /
>     codepoint ranges and still having a fast access/search
>   - detect and eliminate duplicates in mapdata
>   - compact storing mapdata
>
> By 'compact storing' I think of the usage of a continuation bit:
>   1 byte:  0-0x7f                -> 0xxxxxxx
>   2 bytes: 0x80-0x3fff           -> 1xxxxxxx 0xxxxxxx
>   3 bytes: 0x4000-0x1fffff       -> 1xxxxxxx 1xxxxxxx 0xxxxxxx
>   4 bytes: 0x200000-0xFFFFFFF    -> 1xxxxxxx 1xxxxxxx 1xxxxxxx 0xxxxxxx
>   5 bytes: 0x10000000-0xFFFFFFFF -> 1xxxxxxx 1xxxxxxx 1xxxxxxx 1xxxxxxx
>                                     0xxxxxxx
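For illustration, the continuation-bit layout quoted above could be sketched in C roughly as follows. This is only a minimal sketch, not code from libidn2: the names varint_encode and varint_decode are hypothetical, and it assumes the most significant 7-bit group is stored first, which the table does not specify.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libidn2.
 * Encode v into out (room for at least 5 bytes) using 7 payload bits
 * per byte; the high bit is set on every byte except the last one.
 * Assumption: most significant 7-bit group comes first.
 * Returns the number of bytes written (1..5). */
static size_t varint_encode(uint32_t v, uint8_t *out)
{
    size_t n = 1;
    size_t i;
    uint32_t t = v;

    /* Count how many 7-bit groups the value needs. */
    while (t >>= 7)
        n++;

    for (i = 0; i < n; i++) {
        unsigned shift = 7u * (unsigned)(n - 1 - i);
        uint8_t group = (uint8_t)((v >> shift) & 0x7f);

        /* Continuation bit on all bytes but the last. */
        out[i] = (i + 1 < n) ? (uint8_t)(group | 0x80) : group;
    }
    return n;
}

/* Hypothetical helper, not part of libidn2.
 * Decode one value from in and store it in *v.
 * Returns the number of bytes consumed.  The input is trusted:
 * there is no bounds or overlong-sequence checking in this sketch. */
static size_t varint_decode(const uint8_t *in, uint32_t *v)
{
    uint32_t r = 0;
    size_t i = 0;

    do {
        r = (r << 7) | (uint32_t)(in[i] & 0x7f);
    } while (in[i++] & 0x80);

    *v = r;
    return i;
}
```

With this layout, values up to 0x7f take one byte and 0x80 already takes two, so any gain depends on how the values in mapdata are actually distributed; that would have to be measured.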

Yeah, there is a lot of room for optimization in a library like this --
it should be relatively easy to try different storage algorithms and
benchmark them to see what works best in reality.  As you've noticed,
I've taken a really naïve approach initially to get simple (but slow)
code.  It is not always clear beforehand what will actually result in
better performance.

/Simon
_______________________________________________
Help-libidn mailing list
Help-libidn@gnu.org
https://lists.gnu.org/mailman/listinfo/help-libidn