Because I cannot edit the wiki right now, here are my contributions:
1. If the word is not found in the list, try to decompose it into
building blocks: prefixes, infixes, suffixes, and other parts.
This can be done by looking for word parts in a similar way to
replaceRunOnWords, i.e., by moving a space through the word (with a
predefined maximum number of parts per compound, probably more than 2
but at most 4).
This should only apply to a selection of word parts: productive and
trustworthy ones.
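
A minimal sketch of the decomposition step, in Python. It is not the replaceRunOnWords code; the function name split_compound, the PARTS list, and the default limit of 4 are assumptions made for illustration only.

```python
def split_compound(word, parts, max_parts=4):
    """Return every way to split `word` into 2..max_parts known parts."""
    results = []

    def walk(rest, found):
        if not rest:
            # Only a real compound counts: at least two parts.
            if len(found) >= 2:
                results.append(found)
            return
        if len(found) == max_parts:
            # Too many parts already, give up on this branch.
            return
        for i in range(1, len(rest) + 1):
            head = rest[:i]
            if head in parts:
                walk(rest[i:], found + [head])

    walk(word, [])
    return results


# Hypothetical toy list of productive, trustworthy parts.
PARTS = {"spelling", "test", "houding"}

print(split_compound("spellingtest", PARTS))
```

A word that is itself on the parts list is not reported as a compound, and a split that needs more than max_parts parts is rejected.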
Dutch has a compounding s, which is only added to certain parts; it
might be considered as an infix. (spelling+test, houding+s+test, but not
test+houdings)
All compounds have an optional hyphen position (spelling+-+test,
houding+s-+test) which should always be accepted.
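
The compounding s and the optional hyphen could be sketched like this in Python. The function compound_forms and the set TAKES_S are hypothetical, and the sketch assumes that a part which takes the s always takes it (whether houding+test without s should also be accepted is left open here).

```python
def compound_forms(first, second, takes_s):
    """Return the accepted spellings for first+second.

    `takes_s` is the set of first parts that get a compounding s.
    Each compound is accepted both with and without a hyphen.
    """
    link = "s" if first in takes_s else ""
    stem = first + link
    return {stem + second, stem + "-" + second}


# Hypothetical list of parts that take the compounding s.
TAKES_S = {"houding"}

print(sorted(compound_forms("spelling", "test", TAKES_S)))
print(sorted(compound_forms("houding", "test", TAKES_S)))
```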
At compounding points, checks should be possible to reject
auto+onderdeel; auto+-+onderdeel is correct and should be accepted. This
applies to every pair of letters that would combine into one of the Dutch
two-letter vowels (aa ae ai au ee ei eu ie oe oi oo ou ui uu ij).
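
The check at the compounding point can be sketched as follows in Python. The name needs_hyphen is an assumption; the vowel list is the one given above.

```python
# The Dutch two-letter vowels listed above.
TWO_LETTER_VOWELS = {
    "aa", "ae", "ai", "au", "ee", "ei", "eu", "ie",
    "oe", "oi", "oo", "ou", "ui", "uu", "ij",
}


def needs_hyphen(left, right):
    """True if joining left+right would create a two-letter vowel
    across the compounding point, so the hyphen is required."""
    return (left[-1] + right[0]).lower() in TWO_LETTER_VOWELS


print(needs_hyphen("auto", "onderdeel"))
print(needs_hyphen("spelling", "test"))
```

With this check, auto+onderdeel is flagged (o+o gives oo) and only auto-onderdeel is accepted, while spelling+test passes without a hyphen.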
The same goes for some equivalent words: moeder+dochter+relatie should be
moeder+-+dochter+relatie or moeder+-+dochter+-+relatie.
(Hunspell has an issue here; it does not detect this when the entire word is
specified.)
1. We need to mark up incorrect compounds (words that are commonly
written incorrectly) with a FORBID tag after a standard separator.
2. We need to mark up prefixes (words that cannot occur in any other
position than as a prefix) with a PREFIX tag.
3. We need to mark up suffixes with a SUFFIX tag.
4. We need to mark up infixes with an INFIX tag.
Would this support 'moordantipersoneelsmijn'? moord is a word, anti is
in fact a prefix, personeel a word, s an infix(?), mijn a word.
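
To make the question concrete, here is a rough Python sketch of how the position tags could be checked. Everything in it is an assumption: the WORD tag, the TAGS table, the function valid_sequence, and the reading of the rules (a PREFIX may not be last, a SUFFIX may not be first, an INFIX needs a part on both sides). FORBID and the maximum number of parts are left out.

```python
# Hypothetical tagged word list; WORD marks an ordinary word.
TAGS = {
    "moord": "WORD",
    "anti": "PREFIX",
    "personeel": "WORD",
    "s": "INFIX",
    "mijn": "WORD",
}


def valid_sequence(parts, tags=TAGS):
    """Check that every part is known and sits in an allowed position."""
    last = len(parts) - 1
    for i, part in enumerate(parts):
        tag = tags.get(part)
        if tag is None:
            return False  # unknown part
        if tag == "PREFIX" and i == last:
            return False  # a prefix must be followed by something
        if tag == "SUFFIX" and i == 0:
            return False  # a suffix must be preceded by something
        if tag == "INFIX" and (i == 0 or i == last):
            return False  # an infix needs a part on both sides
    return True


print(valid_sequence(["moord", "anti", "personeel", "s", "mijn"]))
```

Under these assumed rules the sequence moord+anti+personeel+s+mijn passes, but it has five parts, so it would still depend on how the maximum is counted.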
Hunspell has 2 compounding methods: (A B C D E compounds) and (first
middle middle middle last). The first is ideal for number generation,
for example; the second is more useful for normal words.
Ruud
On 09-05-13 15:57, Marcin Miłkowski wrote:
On 2013-05-08 19:46, Daniel Naber wrote:
On 08.05.2013 19:34, Ruud Baars wrote:
Daniel, maybe it is an idea to get a page somewhere to get the info
together on requirements for compounding in the spell checker.
Yes, feel free to create a page at http://wiki.languagetool.org/ and
link it from the Wiki's home page.
Here's my attempt to summarise the discussion:
http://wiki.languagetool.org/compounding-support-in-morfologikspeller
Feel free to modify it. (If you don't have an account for the wiki,
please mail me, I will add you as wiki author).
Regards,
Marcin
------------------------------------------------------------------------------
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and
their applications. This 200-page book is written by three acclaimed
leaders in the field. The early access version is available now.
Download your free book today! http://p.sf.net/sfu/neotech_d2d_may
_______________________________________________
Languagetool-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/languagetool-devel