On Sunday 18 December 2005 17:45, Laurent Godard wrote:

Hi Laurent,
> Btw, if I understand correctly, you need an external tagger
> Are your taggers still based on known tagged text?

The English and the German tagger work quite differently: the English one
is trained on a corpus (not by me, it's done by OpenNLP) and uses context
to decide which tag to assign to a word. The German tagger doesn't use
context but assigns all possible tags to a word. For example, "das Haus"
(the house) is a correct phrase because "das" is neuter/singular/nominative
and "Haus" is neuter/singular/nominative too. "des Haus" is incorrect
because "des" is neuter/singular/genitive and "Haus" has no genitive
reading. In other words, you need at least one reading to match in gender,
number, and case. So you need a large list of words with all their
morphological information.

> last point, why did you use Java :( Python is great and your first
> approach was perfect for multiple use (not only dedicated to OOo)

But my first approach also quickly became unmaintainable, even for myself.
Now the code has a better structure, more unit test coverage, can be built
with an Ant script, and is easy to work with in Eclipse. I really need
Java's type safety and the power of Eclipse to work effectively.

Regards
 Daniel

-- 
http://www.danielnaber.de
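P.S. The German agreement check described above can be sketched roughly as
follows. This is only an illustration in Python, not the actual tagger code:
the tiny READINGS lexicon and the agrees() function are made-up names, and
the word list is deliberately incomplete. A real checker would need a large
lexicon with full morphological information.

```python
# Hypothetical mini-lexicon (an assumption for illustration only): each word
# maps to the set of all its possible readings as (gender, number, case).
READINGS = {
    "das": {
        ("neuter", "singular", "nominative"),
        ("neuter", "singular", "accusative"),
    },
    "des": {
        ("neuter", "singular", "genitive"),
        ("masculine", "singular", "genitive"),
    },
    "Haus": {
        ("neuter", "singular", "nominative"),
        ("neuter", "singular", "accusative"),
        ("neuter", "singular", "dative"),
    },
    "Hauses": {
        ("neuter", "singular", "genitive"),
    },
}


def agrees(determiner, noun):
    """Return True if the two words share at least one reading.

    No context is used: every possible reading of each word is considered,
    and the phrase is accepted if any single reading matches in gender,
    number, and case. Words missing from the lexicon never agree.
    """
    shared = READINGS.get(determiner, set()) & READINGS.get(noun, set())
    return bool(shared)


if __name__ == "__main__":
    print(agrees("das", "Haus"))    # True
    print(agrees("des", "Haus"))    # False
    print(agrees("des", "Hauses"))  # True
```

Using sets of readings keeps the check to a simple intersection, which is
why the quality of the result depends almost entirely on the word list.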
