On 2013-08-04 16:04, Daniel Naber wrote:
> On 04.08.2013 15:53, Andriy Rysin wrote:
>
>> Is there a problem with binary file history getting too big? I am
>> still actively developing the Ukrainian dictionary, so there will be a lot
>> of binary commits.
>
> Yes, but have a look at GermanTagger: it uses a text file called
> added.txt that is preferred over the binary file. The file's content
> will only be merged to the binary file when it gets too big (not
> necessarily for every release). We should move that feature to the
> BaseTagger so every language can make use of it.

I don't consider this a feature; it is more of a dirty hack, because:

(1) dictionaries should not be developed manually but generated from 
some database system (I know we do develop dictionaries manually for 
some languages but this should not be encouraged);

(2) text files will inevitably consume more memory than the finite-state 
dictionary, and the tagger will be slower;

(3) for morphologically rich languages, a text file is simply not an option.

BTW, after reading about all the problems with GitHub, I'm beginning to 
be skeptical about whether this is really worth the trouble. SVN at sf.net 
works most of the time anyway.

Regards,
Marcin

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
