Ben Morgan wrote:
On Wed, Jan 21, 2009 at 12:41 AM, DM Smith <dmsmith...@yahoo.com> wrote:


    ICU has the notion of a collation key, which can be used for such a
    purpose. (I think we've gotten to the point where ICU is a
    requirement for UTF-8 modules.) In ICU, the collation key is locale
    dependent. (For example, Germans sort accent marks differently than
    the French. In Spanish dictionaries, at least older ones, ch is a
    separate letter that sorts after cz.) I really don't see any way
    around having a static collation for
    a module. If so, the collation would need to be fixed wrt either a
    fixed locale or a locale based upon the language of the module.
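To make the idea of a static, precomputed collation concrete, here is a minimal sketch in Python. It does not use ICU: toy_collation_key is a hypothetical stand-in built on the standard unicodedata module, it is not locale-aware, and the entry list is made up for illustration. It only shows how ordering by a stored key differs from ordering by raw UTF-8 bytes.

```python
import unicodedata


def toy_collation_key(text):
    """Hypothetical stand-in for a real collation key (NOT locale-aware).

    Primary level: the text with combining marks removed and case folded.
    Tie-breaker: the decomposed original, so the ordering is deterministic.
    """
    decomposed = unicodedata.normalize("NFD", text)
    primary = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    ).casefold()
    return (primary, decomposed)


def build_static_index(entries):
    """Sort entries once by their key, as a module builder might do."""
    return sorted(entries, key=toy_collation_key)


# Made-up dictionary entries; "\u00c9den" is "Eden" with an acute on the E.
entries = ["zebra", "\u00c9den", "apple", "eden"]

# Byte order puts the accented entry after "zebra", since its first
# UTF-8 byte (0xC3) is greater than any ASCII letter.
byte_order = sorted(entries, key=lambda s: s.encode("utf-8"))

# Key order keeps the accented entry next to its unaccented neighbour.
key_order = build_static_index(entries)
```

A real implementation would get its keys from a proper collator fixed to one locale; the point here is only that the key, not the byte sequence, decides the order stored in the module.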

DM's suggestion (not merely the part pertaining to ICU) sounds good to me. It does represent a rather radical change since it's a proposal for a whole new driver type, but that might be what we need in order to get the kind of flexibility we need going forward.

ICU is not a requirement for using UTF-8 modules; rather than use ICU, most frontends (certainly BPBible, GnomeSword, BibleTime and I think MacSword as well) have defined their own string manager code (generally using the platform: Qt, GLib or Python).

DM is really correct that we're coming to the point where ICU is going to be a necessity for app i18n/l10n. ICU provides up-to-date collation and normalization facilities that are a necessity for correctly managing Unicode data in anything other than a braindead manner (as our byte-ordered LD entries currently are). Searching, including functions like accent normalization and correct case folding, isn't possible without a certain level of Unicode knowledge within the app. And when we actually think about doing lookup via transliteration (something every other piece of professional Bible software handles), we can either go to the effort of rolling our own transliteration facility or use the ready-made one provided in ICU (as Logos does).
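As a rough illustration of what accent normalization and case folding mean for searching, here is a small Python sketch. It uses only the standard unicodedata module rather than ICU, and the names search_key and matches are hypothetical helpers invented for this example, not part of Sword or any frontend.

```python
import unicodedata


def search_key(text):
    """Reduce text to an accent-insensitive, case-insensitive form.

    Assumption: stripping all combining marks is acceptable for the
    search in question; some languages would need gentler rules.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # casefold() is stronger than lower(): it maps sharp s to "ss"
    # and Greek final sigma to ordinary sigma.
    return stripped.casefold()


def matches(query, entry):
    """True if the normalized query occurs inside the normalized entry."""
    return search_key(query) in search_key(entry)


# "R\u00e9sum\u00e9" is "Resume" with acute accents on both e's.
found_accented = matches("resume", "R\u00e9sum\u00e9")

# "Stra\u00dfe" contains the German sharp s.
found_sharp_s = matches("strasse", "Stra\u00dfe")

# Greek "logos" typed without accent and with a non-final sigma,
# matched against the accented form that ends in a final sigma.
found_greek = matches(
    "\u03bb\u03bf\u03b3\u03bf\u03c3", "\u03bb\u03cc\u03b3\u03bf\u03c2"
)
```

A plain lower() comparison would miss all three of these, which is the kind of Unicode knowledge the app needs from somewhere, whether from ICU or from the platform.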

MacSword may be exempt from needing ICU for a while, as would any other MacOS or iPhone program, for the simple fact that much of ICU's functionality should be available through platform APIs. That's because Apple has included ICU on both of these platforms, though it won't ever be the most recent release and may lack some data.

Personally, BPBible doesn't use ICU for two reasons: the extra size for ICU and the transliterators it supplies. When Sword is compiled with ICU, it adds transliteration filters, which are really buggy (crashes, mixed-up XML, etc.).

The extra download size added by ICU data is 3 MB, less than the size of 2 Bibles. In 2009, I can't see anyone complaining about a 3 MB increase in download size. Even PDAs and cell phones are shipping with gigs of memory.

Regarding stability of the transliterators, I've just disabled all but the primary Latin transliterators, which should eliminate most problems. If problems remain, please let us know (preferably via the bug tracker). We can add some of the other Latin-oriented transliterators back at a later date, once we've checked them and established their stability.

Put simply, complete i18n and l10n of Sword and Sword frontends aren't within our reach without ICU.

--Chris

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
