On 06.08.2013 17:39, Marcin Miłkowski wrote:

> (1) dictionaries should not be developed manually but generated from
> some database system (I know we do develop dictionaries manually for
> some languages but this should not be encouraged);

I agree. Trying lexeme_forge for German is on my TODO list. Please post 
tips about this if you can.

> (2) text files will inevitably consume more memory than the finite state
> dictionary -- and the tagger will be slower;

The data is put in a HashMap, so it takes memory but lookups will be 
fast. The plain text dictionary shouldn't contain more than a few 
thousand words or so.
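
To illustrate the HashMap approach, here is a minimal sketch, not the 
actual LanguageTool code. The class name PlainTextTagger and the file 
format (one "form<TAB>lemma<TAB>POS tag" entry per line, "#" for 
comments) are assumptions made only for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: loads a small plain text dictionary into memory.
public class PlainTextTagger {

    // word form -> list of "lemma/POS" readings
    private final Map<String, List<String>> entries = new HashMap<>();

    public PlainTextTagger(BufferedReader reader) throws IOException {
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            // skip blank lines and comments
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            // assumed format: form<TAB>lemma<TAB>POS tag
            String[] parts = line.split("\t");
            if (parts.length != 3) {
                throw new IOException("Unexpected format: " + line);
            }
            entries.computeIfAbsent(parts[0], k -> new ArrayList<>())
                   .add(parts[1] + "/" + parts[2]);
        }
    }

    // Average-case constant-time lookup; unknown words give an empty list.
    public List<String> lookup(String word) {
        return entries.getOrDefault(word, Collections.emptyList());
    }
}
```

Every entry lives on the heap as ordinary Java objects, which is why 
this only makes sense for a few thousand words; a lookup is a single 
hash access.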

> BTW, after reading about all the problems with github, I'm beginning to
> be skeptical if this is really worth the trouble. SVN at sf.net works
> most of the time anyway.

It took a lot of trial and error, but all major problems are solved 
now. I'm basically just waiting for at least one person to clone the code 
(https://github.com/danielnaber/languagetool-test) and confirm that 
everything looks fine.

Regards
  Daniel

-- 
http://www.danielnaber.de

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
