On Thu, 2015-11-12 at 15:43 +0000, Russ, Daniel (NIH/CIT) [E] wrote:
> 1) I use the old sourceforge models.  I find that the sources of error
> in my analysis are usually not due to mistakes in sentence detection or
> POS tagging.  I don’t have the annotated data or the time/money to
> build custom models.  Yes, the text I analyze is quite different than
> the (WSJ? or what corpus was used to build the models), but it is good
> enough. 

That is interesting; I wasn't aware that those models are still useful.

It really depends on the component as well; I was mostly thinking about
the name finder models when I wrote that.

Do you only use the Sentence Detector, Tokenizer and POS tagger?

You could use OntoNotes (almost for free) to train models. Maybe we
should look into distributing models trained on OntoNotes.

Jörn
