Quoting Simon Brouwer <[EMAIL PROTECTED]>:

> Hi Nemeth,
>
> [EMAIL PROTECTED] wrote:
>
> >Hi Artavazd,
> >
> >You can use your patch for Armenian OOo 2.0, but using Hunspell
> >(really extended MySpell) is a general solution for encoding problems.
> >
> >Hunspell integration is targeted to OOo 2.0.2 (end of february 2006),
> >
> Does that mean we have to modify the format of the existing Myspell
> dictionaries?

Hi Simon,

No, Hunspell is backward compatible with MySpell.

Dmitri, thanks for the answer! Hunspell supports NOSPLITSUGS.

I strongly believe Hunspell can help with handling Dutch compound words. (By the way, I have a little Christmas surprise for Dutch users of OOo. I hope I can post it at the weekend. :)

> Or is it possible to use different spell checkers, e.g. if there is more
> than one language in a document,
> one language might be checked using Hunspell and another using Myspell.

Björn Jacke has suggested a dictionary.lst syntax to differentiate MySpell and Hunspell dictionaries (because the German Hunspell dictionary uses new features of Hunspell, and it doesn't work well with MySpell). But new versions of Hunspell could also have new features, so I think we only need a policy for downloadable OOo dictionaries. It is enough that DictOOo always supports the spell checker version of the last stable version of OOo. (Localised versions of OOo can contain newer spell checking dictionaries with a newer Hunspell or other spell checkers.)

> >The right tokenization comes from the OOo's breakiterator.
> >If the default tokenization is bad for Armenian, you need a Breakiterator
> >patch. (See i18npool/source/breakiterator/ and its data/ subdirectory).
> >
> Will the different behaviour of the breakiterator be effective on all
> the languages in the document, or
> can it also be switched depending on the language?

I have suggested language-specific breakiterator patches, like the Catalan, Hungarian etc. dict_word patches in the i18npool/source/breakiterator/data directory.

> For Dutch spell checking, it would be preferable if the break iterator
> could be instructed not to break
> on hyphens, because the new Dutch spelling introduces Dutch words
> that include a hyphen, of
> which not all parts are also valid words (example:
> "arbeidsre-integratie", in which "arbeidsre" is not a Dutch word).

Similar to Hungarian.
See i18npool/source/breakiterator/data/dict_word_hu (the new version of dict_word_hu also includes the n-dash as a word character).

Best regards,
Laci

> --
> Vriendelijke groet,
> Simon Brouwer.
>
> ### nl.openoffice.org ###
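
To illustrate the hyphen issue discussed above, here is a rough Python sketch. It is not the OOo breakiterator and does not use its dict_word data format. The function name `tokenize` and the parameter `extra_word_chars` are made up for this example. It only shows why a word such as "arbeidsre-integratie" must reach the spell checker as one token: if the hyphen is a break character, the checker is handed "arbeidsre", which is not a Dutch word.

```python
import re


def tokenize(text, extra_word_chars=""):
    """Split text into word tokens.

    Hypothetical helper for illustration only. Characters listed in
    extra_word_chars (for example "-" or the n-dash) are kept inside a
    token when they sit between two runs of letters.
    """
    letters = r"[^\W\d_]+"  # one or more Unicode letters
    if extra_word_chars:
        joiner = "[" + re.escape(extra_word_chars) + "]"
        pattern = letters + "(?:" + joiner + letters + ")*"
    else:
        pattern = letters
    return re.findall(pattern, text)


sample = "de arbeidsre-integratie"

# Default behaviour: the hyphen breaks the word in two.
print(tokenize(sample))

# Language-specific behaviour: hyphen and n-dash count as word characters.
print(tokenize(sample, extra_word_chars="-\u2013"))
```

With the default settings the spell checker would see "arbeidsre" and "integratie" separately; with the hyphen treated as a word character it sees the whole compound, which is the behaviour asked for above for Dutch and already patched in for Hungarian.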
