Andrew Bredenkamp of Acrolinx gave an example of this at SAS2011. In the Penn TreeBank corpus the word "object" is tagged as a VERB 99% of the time, but in the SAP corpus it usually refers to an instance of a class, i.e. a noun. A tagger trained only on the former will therefore mis-tag most occurrences in the latter.
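To make the effect concrete, here is a minimal sketch of a most-frequent-tag baseline trained on one domain and scored on another. The token lists and their counts are invented toy data that only mimic the proportions mentioned above (they are not real Penn TreeBank or SAP figures), and the function names are hypothetical. It does not use the OpenNLP API.

```python
from collections import Counter, defaultdict


def train_most_frequent_tag(tagged_tokens):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for word, tag in tagged_tokens:
        counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def accuracy(model, tagged_tokens):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = sum(1 for word, tag in tagged_tokens if model.get(word.lower()) == tag)
    return correct / len(tagged_tokens)


# Toy, invented data: NOT real corpus counts.
newswire = [("object", "VERB")] * 99 + [("object", "NOUN")]
software_docs = [("object", "NOUN")] * 95 + [("object", "VERB")] * 5

newswire_model = train_most_frequent_tag(newswire)      # learns object -> VERB
adapted_model = train_most_frequent_tag(software_docs)  # learns object -> NOUN

print(accuracy(newswire_model, software_docs))  # out-of-domain model
print(accuracy(adapted_model, software_docs))   # model retrained on the target domain
```

On this toy data the out-of-domain model gets only the few verb uses right, while the retrained one gets the noun uses right. A real tagger also uses context, so the gap would be smaller, but the reason for retraining is the same.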

On Sun, Dec 11, 2011 at 2:48 PM, Jason Baldridge <jasonbaldri...@gmail.com> wrote:

> Yep. Domain adaptation (and dealing with new languages) are as important,
> or more important, in NLP as they are in general for other types of
> problems that are addressed with machine learning. Once we get better at
> injecting better prior information about language (in the general sense)
> into our models, maybe that will start looking better.
>
> On Sun, Dec 11, 2011 at 11:04 AM, Josh Patterson <j...@cloudera.com> wrote:
>
> > ok, that makes more sense. I'm not that familiar with how training
> > affects NLP, but I am versed in training for general ML purposes ---
> > which seems to be the same idea here.
> >
> > Thanks,
> >
> > JP
> >
> > On Sun, Dec 11, 2011 at 9:12 AM, Jason Baldridge <jasonbaldri...@gmail.com> wrote:
> > > For new domains (e.g. Twitter) and/or new languages, or using more data to
> > > get a better model. -Jason
> > >
> > > On Sat, Dec 10, 2011 at 10:07 PM, Josh Patterson <j...@cloudera.com> wrote:
> > >
> > >> working with the examples and reading:
> > >>
> > >> http://sourceforge.net/apps/mediawiki/opennlp/index.php?title=Sentence_Detector
> > >>
> > >> I've noticed the section on "Training"; Given that the models already
> > >> detect things like sentences and POS, in what circumstances would one
> > >> want to "train" the model further?
> > >>
> > >> Josh
> > >>
> > >> --
> > >> Twitter: @jpatanooga
> > >> Solution Architect @ Cloudera
> > >> hadoop: http://www.cloudera.com
> > >
> > > --
> > > Jason Baldridge
> > > Associate Professor, Department of Linguistics
> > > The University of Texas at Austin
> > > http://www.jasonbaldridge.com
> > > http://twitter.com/jasonbaldridge