Daniel, it might be a good idea to set up a page somewhere to collect
the requirements for compounding in the spell checker.

A wiki page?

I have contributed a lot of ideas to the most recent version of
Hunspell, things that are required for good Dutch support.

Ruud

On 08-05-13 18:08, Daniel Naber wrote:
> On 07.05.2013 23:33, Marcin Miłkowski wrote:
>
>> Well, for me it seems to be the same issue still, as I haven't been
>> given any reason to believe that hunspell expansion would not give me
>> a compounding mechanism for our speller (beyond the size of the word
>> list).
> I see no reason other than the size of the list. As every noun can
> basically be combined with every other noun, you'll have 30,000^2
> (900 million) combinations if there are 30,000 nouns. And as compounds
> are not limited to two parts, you'd have another 30,000^3
> (2.7 x 10^13) words if you consider all three-part compounds.
>
> But the way hunspell works can probably be mapped to an FSA. The
> hunspell compound tags of the words say:
> * this is only a compound beginning, not a stand-alone word ("Arbeits"
> in German)
> * this is only a compound part, but not at the beginning (basically
> any noun but spelled lowercase, and a lot of other words)
> * this is a noun that can be used both stand-alone and as a
> compound beginning (most nouns in German)
>
> Actually the tags' meaning might be slightly different (I didn't look
> them up now), but all of this can, I think, be expressed by interpreting
> an FSA that's built accordingly, without the need to generate a word
> list. A blacklist of "invalid" words is needed anyway.
>
> I don't have time to dig into this now, but could write test cases etc.
> So let me know if I can help with that.
>
> Regards
>    Daniel
>
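
To make the idea in the quoted message a bit more concrete, here is a rough
Python sketch of checking compounds on the fly instead of expanding them
into a word list. It is only an illustration under assumptions: the three
flag names, the toy lexicon and the blacklist entry are all made up here
and are not real hunspell tags or dictionary data, and the code is a plain
recursive segmenter rather than an actual FSA.

```python
# Hypothetical flags, loosely following the three cases in the quoted list.
BEGIN_ONLY = "begin_only"        # compound beginning only, e.g. "Arbeits"
NON_INITIAL = "non_initial"      # compound part, but never at the beginning
WORD_OR_BEGIN = "word_or_begin"  # stand-alone noun or compound beginning

# Toy lexicon (illustrative entries only, not from a real dictionary).
LEXICON = {
    "Arbeits": {BEGIN_ONLY},
    "Haus": {WORD_OR_BEGIN},
    "haus": {NON_INITIAL},
    "Tür": {WORD_OR_BEGIN},
    "tür": {NON_INITIAL},
    "amt": {NON_INITIAL},
}

# Blacklist of compounds the rules would accept but that should be rejected
# (made-up entry for the example).
BLACKLIST = {"Haushaus"}


def _is_compound(word, start=0):
    """Return True if word[start:] splits into parts allowed at their position."""
    for end in range(start + 1, len(word) + 1):
        flags = LEXICON.get(word[start:end], set())
        if start == 0:
            # The first part must be a beginning and must not be the whole word.
            allowed = end < len(word) and bool(flags & {BEGIN_ONLY, WORD_OR_BEGIN})
        else:
            allowed = NON_INITIAL in flags
        if not allowed:
            continue
        if end == len(word) or _is_compound(word, end):
            return True
    return False


def is_valid(word):
    """Accept stand-alone words and rule-conforming compounds, minus the blacklist."""
    if word in BLACKLIST:
        return False
    if WORD_OR_BEGIN in LEXICON.get(word, set()):
        return True
    return _is_compound(word)
```

With this toy data, is_valid("Arbeitsamt") and is_valid("Haustür") hold,
while "Arbeits" alone, "ArbeitsHaus" and the blacklisted "Haushaus" are
rejected. The lexicon stays at six entries no matter how many compounds it
licenses, which is the point about list size made above.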


_______________________________________________
Languagetool-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/languagetool-devel