Quoting Artavazd Mertarjyan <[EMAIL PROTECTED]>:

> Hi All!
>
> Thanks for the detailed answers!
> I agree that HunSpell is better than MySpell and I'm going to localize it
> for Armenian too.
>
> In the "hu" project CVS (2.0.1) the Hungarian language isn't defined as UTF-8.
> Does that mean you are not using UTF-8 for Hungarian, or do you have another
> solution?

Hi,

You can find the source of the Hungarian OOo 2.0.1 build on our build server:

http://ftp.fsf.hu/OpenOffice.org_hu/2.0.1/

See the hu_HU_u8.aff and hu_HU_u8.dic files in
http://ftp.fsf.hu/OpenOffice.org_hu/2.0.1/OOo_2.0.1_src_hu_additional.tar.gz
and in the builds.

> If you are using UTF-8 now, have you compared these two solutions? Which is
> faster?
>
> I have some doubts about the score of HunSpell's UTF-8 text spell checker.
> Maybe for Armenian it will be better to use the same algorithm in
> HunSpell?

Unicode encoding has a little overhead. Using a UTF-8 dictionary is
10-20 percent slower on Hungarian texts (it checks 80,000-90,000 words/s
instead of 100,000 words/s on my machine).

But I think UTF-8 Armenian spell checking will be faster than _8-bit_
Hungarian spell checking, because the bottleneck is the complexity of
the morphology (the affix description) and the compound word support.
Hungarian uses double suffix stripping plus compounding, and it is
fast enough with UTF-8 encoding, too.

We need the best spell checking and other Lingucomponent support for
Armenian, too. Please write about the problems of Armenian OOo (and file
issues in Issuezilla): for example, the need for an Armenian
breakiterator patch, Armenian hyphenation with the special Armenian
hyphen character, etc. UTF-8 encoding has already been set in the
Thesaurus component of OOo 2.0.1, thanks to the report of the Nepali
developers.

Best regards,
Laci
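
The throughput figures quoted in this message (100,000 words/s with the 8-bit dictionary versus 80,000-90,000 words/s with the UTF-8 dictionary) can be checked against the stated 10-20 percent slowdown. The following is only a minimal sketch of that arithmetic; the function name and constants are illustrative assumptions and are not part of Hunspell or OOo.

```python
def slowdown_percent(baseline_wps: float, measured_wps: float) -> float:
    """Relative throughput loss of measured vs. baseline, in percent."""
    return 100.0 * (baseline_wps - measured_wps) / baseline_wps


# Figures taken from the message: words checked per second on one machine.
EIGHT_BIT_WPS = 100_000            # 8-bit Hungarian dictionary
UTF8_WPS_RANGE = (80_000, 90_000)  # UTF-8 Hungarian dictionary (low, high)

for wps in UTF8_WPS_RANGE:
    pct = slowdown_percent(EIGHT_BIT_WPS, wps)
    print(f"{wps} words/s is {pct:.0f}% slower than {EIGHT_BIT_WPS} words/s")
```

The two ends of the measured range correspond to the two ends of the quoted slowdown, so the numbers in the message are mutually consistent.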
