Hi Zeno,

See below ...
----- [EMAIL PROTECTED] wrote:
> as i know Koha 3.0 it will base the opac on Zebra for the largest
> sites. But as I see here
> http://lists.indexdata.dk/pipermail/zebralist/2007-May/001522.html,
> Zebra has same limit on a full support of UTF-8.
>
> Viewing the problem for Koha, where are the limits ?
> Can I input data in Latin, Arabic, Chinese, etc scripts and search
> them ?
>
> With a mix of input scripts do you seggest to use Koha 3.0 without
> Zebra ?

There are plenty of examples of folks using Zebra to manage
non-Latin-1 languages, for instance:

  Greek + English
  Russian + English
  Scandinavian languages + English
  Turkish + English

However, it is currently not possible to index more than two or three
of these simultaneously in the same document corpus, as there is a hard
limit of 256 indexable characters.

The Index Data folks are in the process of integrating the ICU Unicode
libraries into Zebra, which will give Zebra the capability to index the
full UTF-8 character set in a single document corpus, with no
restriction on indexable characters.

The ICU UTF-8 integration work will provide character normalization and
tokenization over the full UTF-8 range of characters, but it may not
provide tokenization for languages like Japanese and Korean, which can
require deep linguistic knowledge of the language and could be a
lifetime study in itself. That said, it should at a minimum support
languages that use whitespace as the word separator.

Note that in Koha we can do some stemming, synonym expansion, and
article removal/stopword creation pre-index and pre-search for the
languages that aren't directly supported in Zebra.
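To make the pre-index/pre-search idea a bit more concrete, here is a
rough sketch in Python (Koha itself is written in Perl, so this is
illustration only, not Koha or Zebra code). The function names and the
tiny stopword list are made up for the example. It normalizes text,
splits on whitespace, drops stopwords, and gives a rough count of
distinct characters in a corpus, which is not necessarily how Zebra
itself counts against its 256-character limit.

```python
import unicodedata

# Hypothetical stopword list for the example; a real setup would
# maintain one list per language.
STOPWORDS = {"the", "a", "an", "и"}


def normalize(text):
    """Fold compatibility forms (NFKC) and case so variants compare equal."""
    return unicodedata.normalize("NFKC", text).casefold()


def index_terms(text):
    """Whitespace tokenization plus stopword removal.

    Only suitable for languages that separate words with whitespace.
    """
    return [tok for tok in normalize(text).split() if tok not in STOPWORDS]


def distinct_chars(docs):
    """Rough count input: the set of distinct non-space characters
    left in a corpus after normalization."""
    return {ch for doc in docs for ch in normalize(doc) if not ch.isspace()}


if __name__ == "__main__":
    docs = ["The Ｋｏｈａ catalogue", "Война и мир"]
    for doc in docs:
        print(index_terms(doc))
    print(len(distinct_chars(docs)), "distinct characters")
```

The same index_terms() step would be applied to the user's query before
searching, so that the indexed terms and the search terms are in the
same form.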
Hope that answers your question without getting too technical ;-)

Cheers,

-- 
Joshua Ferraro                       SUPPORT FOR OPEN-SOURCE SOFTWARE
President, Technology       migration, training, maintenance, support
LibLime                                Featuring Koha Open-Source ILS
[EMAIL PROTECTED] |Full Demos at http://liblime.com/koha |1(888)KohaILS
