Thanks for the information, Josh. I want a model that identifies the topic of a given website (this is for students, to identify the subject). For this I am using the document categorizer with my own corpus of nearly 2 GB (e.g. science <space> text describing science).
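To make the training format concrete: each sample is a category label, then whitespace, then the document text. Below is a minimal sketch in plain Java that splits one such line into its label and text. The class name `CategorizedLine`, the method `parse`, and the sample sentence are made up for illustration, and the code does not call the OpenNLP API itself.

```java
public class CategorizedLine {

    // Splits a line of the form "category<whitespace>text" into
    // a two-element array: [category, text].
    // Hypothetical helper for illustration only, not part of OpenNLP.
    static String[] parse(String line) {
        String[] parts = line.trim().split("\\s+", 2);
        if (parts.length < 2 || parts[0].isEmpty()) {
            throw new IllegalArgumentException(
                "Expected 'category text', got: " + line);
        }
        return parts;
    }

    public static void main(String[] args) {
        // Made-up sample line in the "science <space> text" format.
        String[] sample = parse("science Plants make food from light.");
        System.out.println(sample[0] + " -> " + sample[1]);
    }
}
```

Checking every line this way before training would presumably catch entries that are missing either the label or the text.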
Thanks,
Johnson.

On Sun, Dec 11, 2011 at 9:37 AM, Josh Patterson <j...@cloudera.com> wrote:
> working with the examples and reading:
>
> http://sourceforge.net/apps/mediawiki/opennlp/index.php?title=Sentence_Detector
>
> I've noticed the section on "Training"; Given that the models already
> detect things like sentences and POS, in what circumstances would one
> want to "train" the model further?
>
> Josh
>
> --
> Twitter: @jpatanooga
> Solution Architect @ Cloudera
> hadoop: http://www.cloudera.com