On 07.05.2013 23:33, Marcin Miłkowski wrote:

> Well, for me it seems to be the same issue still, as I haven't been
> given any reason to believe that hunspell expansion would not give me
> a compounding mechanism for our speller (beyond the size of the word
> list).

I see no reason other than the size of the list. As every noun can 
basically be combined with every other noun, you'll have 30,000^2 
combinations if there are 30,000 nouns. And as there are not only 
compounds made up of two words, you'd have another 30,000^3 words if you 
consider all three-part compounds.
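
To put rough numbers on that, here is a quick back-of-the-envelope calculation. It assumes the 30,000-noun figure from above and that every noun combines freely with every other, which overstates the real count somewhat:

```python
# Rough size of a fully expanded compound list, assuming 30,000 nouns
# that can all be combined freely (an upper bound, not a measured figure).
nouns = 30_000

two_part = nouns ** 2    # all two-part compounds
three_part = nouns ** 3  # all three-part compounds

print(f"two-part compounds:   {two_part:,}")    # 900,000,000
print(f"three-part compounds: {three_part:,}")  # 27,000,000,000,000
```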

But the way hunspell works can probably be mapped to an FSA. The 
hunspell compound tags of the words say:
* this is only a compound beginning, not a stand-alone word ("Arbeits" 
in German)
* this is only a compound part, but not at the beginning (basically 
any noun but spelled lowercase, and a lot of other words)
* this is a noun that can be used both stand-alone and as a 
compound beginning (most nouns in German)

Actually the tags' meaning might be slightly different (I didn't look 
them up now), but all of this can, I think, be expressed by interpreting 
an FSA that's built accordingly, without the need to generate a word 
list. A blacklist of "invalid" words is needed anyway.
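
A minimal sketch of what I mean, in Python for brevity. The tag names, the tiny lexicon, the blacklist entry and the transition table are all made up for illustration; they are not the real hunspell flags and not LanguageTool code. The point is only that a word is checked by walking a small state machine over its parts instead of looking it up in an expanded list:

```python
from functools import lru_cache

# Hypothetical tags mirroring the three cases above (not real hunspell flags).
BEGIN_ONLY = "begin_only"                    # e.g. "Arbeits"
NON_INITIAL = "non_initial"                  # lowercase part, never first
STANDALONE_OR_BEGIN = "standalone_or_begin"  # ordinary noun

# Toy lexicon and blacklist, invented for this example.
LEXICON = {
    "Arbeits": BEGIN_ONLY,
    "Haus": STANDALONE_OR_BEGIN,
    "Tür": STANDALONE_OR_BEGIN,
    "tür": NON_INITIAL,
    "zeit": NON_INITIAL,
    "konto": NON_INITIAL,
}
BLACKLIST = {"Haustürzeit"}

# (state, tag of next part) -> next state; missing entries mean "reject".
TRANSITIONS = {
    ("start", BEGIN_ONLY): "need_more",
    ("start", STANDALONE_OR_BEGIN): "complete",
    ("need_more", NON_INITIAL): "complete",
    ("complete", NON_INITIAL): "complete",
}
ACCEPTING = {"complete"}


def is_valid(word):
    """Return True if word splits into lexicon parts accepted by the automaton."""
    if word in BLACKLIST:
        return False

    @lru_cache(maxsize=None)
    def walk(pos, state):
        # At the end of the word, accept only in an accepting state.
        if pos == len(word):
            return state in ACCEPTING
        # Try every possible next part starting at pos.
        for end in range(pos + 1, len(word) + 1):
            tag = LEXICON.get(word[pos:end])
            nxt = TRANSITIONS.get((state, tag))
            if nxt is not None and walk(end, nxt):
                return True
        return False

    return walk(0, "start")
```

With this toy data, is_valid("Arbeitszeitkonto") and is_valid("Haustür") are accepted, while "Arbeits" alone, "zeit" alone and the blacklisted "Haustürzeit" are rejected. A real implementation would compile the lexicon itself into the automaton rather than trying substrings against a dictionary.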

I don't have time to dig into this now, but I could write test cases etc. 
So let me know if I can help with that.

Regards
  Daniel

-- 
http://www.danielnaber.de

_______________________________________________
Languagetool-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/languagetool-devel