Hi,

I would like MySpell to go completely away (and soon). I simply do not have the time to maintain it properly and Hunspell can do everything that MySpell does and much more.

So IMHO, we should remove MySpell completely from the source tree when Hunspell is integrated.

That will remove the duplication and prevent confusion. Then we would have no MySpell vs. Hunspell issues in dictionary.lst.

Kevin

On Dec 23, 2005, at 6:54 AM, [EMAIL PROTECTED] wrote:

Quoting Simon Brouwer <[EMAIL PROTECTED]>:

Hi Nemeth,

[EMAIL PROTECTED] wrote:

Hi Artavazd,

You can use your patch for Armenian OOo 2.0, but using Hunspell
(really extended MySpell) is a general solution for encoding problems.

Hunspell integration is targeted to OOo 2.0.2 (end of February 2006),


Does that mean we have to modify the format of the existing MySpell
dictionaries?

Hi Simon,

No, Hunspell is backward compatible with MySpell. Dmitri, thanks for the
answer! Hunspell supports NOSPLITSUGS. I strongly believe Hunspell can
help in handling Dutch compound words. (By the way, I have a little
Christmas surprise for Dutch users of OOo. I hope I can post it this
weekend. :)


Or is it possible to use different spell checkers? E.g. if there is more
than one language in a document, could one language be checked using
Hunspell and another using MySpell?

Björn Jacke has suggested a dictionary.lst syntax to differentiate
MySpell and Hunspell dictionaries (because the German Hunspell dictionary
uses new features of Hunspell, and it doesn't work well with MySpell).
But new versions of Hunspell could also have new features, so I think we
only need a policy for downloadable OOo dictionaries. It's enough that
DictOOo always supports the spell checker version of the last stable
version of OOo. (Localised versions of OOo can contain newer spell
checking dictionaries with a newer Hunspell or other spell checkers.)


The right tokenization comes from OOo's breakiterator.
If the default tokenization is bad for Armenian, you need a breakiterator
patch. (See i18npool/source/breakiterator/ and its data/ subdirectory.)


Will the different behaviour of the breakiterator be effective on all
the languages in the document, or can it also be switched depending on
the language?

I have suggested language-specific breakiterator patches, like
the Catalan, Hungarian etc. dict_word patches in the
i18npool/source/breakiterator/data directory.


For Dutch spell checking, it would be preferable if the break iterator
could be instructed not to break on hyphens, because the new Dutch
spelling introduces words that include a hyphen, of which not all parts
are valid words on their own (example: "arbeidsre-integratie", in which
"arbeidsre" is not a Dutch word).

Similar to Hungarian. See i18npool/source/breakiterator/data/dict_word_hu
(the new version of dict_word_hu also includes the n-dash as a word
character).
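
To illustrate the hyphen point above, here is a minimal Python sketch. It is not OOo's breakiterator and does not read the dict_word data files; the tokenize and misspelled helpers and the two-entry lexicon are hypothetical, made up only to show why treating the hyphen as a word character changes what the spell checker sees.

```python
import re

# Hypothetical toy lexicon for illustration; not taken from a real dictionary.
LEXICON = {"arbeidsre-integratie", "integratie"}

def tokenize(text, word_chars=""):
    """Split text into words.

    word_chars lists extra characters (e.g. "-") that count as part of a
    word instead of acting as a break position.
    """
    pattern = r"[\w" + re.escape(word_chars) + r"]+"
    return re.findall(pattern, text)

def misspelled(text, word_chars=""):
    """Return the tokens that are not found in the toy lexicon."""
    return [w for w in tokenize(text, word_chars) if w.lower() not in LEXICON]

# Breaking on the hyphen hands "arbeidsre" to the checker as its own word.
print(misspelled("arbeidsre-integratie"))
# Keeping the hyphen as a word character checks the compound as one token.
print(misspelled("arbeidsre-integratie", word_chars="-"))
```

With the default break on hyphens, the fragment "arbeidsre" is flagged; with the hyphen treated as a word character, the whole compound is looked up as a single entry.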

Best regards,

Laci


--
Vriendelijke groet,
Simon Brouwer.

### nl.openoffice.org ###


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: dev- [EMAIL PROTECTED]
