> Talking of which, will sword be able to use stock ICU 2.0, or are there still specific bits you need to add? If so, is there anything we can do to split those off into a separate library?
There is some hope that we can ship just the portions of data that we need, in addition to the stock data. This looks definitely possible for locales and converters (neither of which we use), but perhaps not for transliterators (which we do use).

The data, in its compiled form, is essentially platform independent except for endianness and character set family. So if we can supplement the stock data, we could supply pre-compiled data in three flavors: little-endian, big-endian, and EBCDIC big-endian.

I will look closer at this issue as soon as I have finished writing some new transliterators for the remaining scripts that aren't supported by stock ICU.

Here's the ICU User's Guide page about the data, if you want to try deciphering it yourself: http://www-124.ibm.com/icu/userguide/icudata.html. One caveat, though: ICU 2.0 seems to be very buggy and the docs are quite often wrong. :)

--Chris
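
P.S. To make the three data flavors concrete, here is a rough sketch of how a build or installer could pick the right pre-compiled file for the host. It assumes the one-letter convention that, as far as I can tell, ICU uses in its own data names ('l' for little-endian ASCII, 'b' for big-endian ASCII, 'e' for big-endian EBCDIC). The helper names and the "swordtrans" base name are hypothetical, and nothing here calls into ICU itself.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// True when the host stores the least significant byte first.
bool hostIsLittleEndian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);  // look at the lowest-addressed byte
    return first == 1;
}

// True when the execution character set is EBCDIC
// ('A' is 0xC1 in EBCDIC and 0x41 in ASCII).
bool hostIsEbcdic() {
    return static_cast<unsigned char>('A') == 0xC1;
}

// Hypothetical helper: map a platform description to the assumed
// one-letter data type suffix ('l', 'b', or 'e').
char dataTypeLetter(bool littleEndian, bool ebcdic) {
    if (ebcdic) return 'e';  // only big-endian EBCDIC data is considered here
    return littleEndian ? 'l' : 'b';
}

// Hypothetical helper: build the file name of a supplemental data
// package, e.g. "swordtrans" -> "swordtransl.dat" on a little-endian host.
std::string supplementalDataName(const std::string& base,
                                 bool littleEndian, bool ebcdic) {
    return base + dataTypeLetter(littleEndian, ebcdic) + ".dat";
}
```

Calling supplementalDataName("swordtrans", hostIsLittleEndian(), hostIsEbcdic()) would then name the file to install on the current machine. Whether ICU 2.0 will actually load transliterator data from such a supplemental package is exactly the open question above.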
