An example of this was given by Andrew Bredenkamp of Acrolinx at
SAS2011. In the Penn Treebank corpus the word "object" is tagged as a verb
99% of the time, but in the SAP corpus it usually refers to an instance of
a class, i.e. a noun.
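
To make the point concrete, here is a minimal sketch of a most-frequent-tag
baseline trained on two toy corpora. Everything in it is an assumption for
illustration: the function name, the data, and the 9-to-1 counts are made
up, and this is not how OpenNLP's taggers work internally. It only shows
that the same word can end up with a different tag depending on the
training domain.

```python
from collections import Counter, defaultdict


def train_most_frequent_tag(tagged_tokens):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for word, tag in tagged_tokens:
        counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


# Toy, made-up training data; the counts are illustrative only.
newswire = [("object", "VB")] * 9 + [("object", "NN")] * 1
sap_docs = [("object", "NN")] * 9 + [("object", "VB")] * 1

newswire_model = train_most_frequent_tag(newswire)
sap_model = train_most_frequent_tag(sap_docs)

print(newswire_model["object"])  # tag learned from the newswire-like data
print(sap_model["object"])       # tag learned from the SAP-like data
```

A model trained on one domain carries that domain's statistics with it, so
retraining (or adding in-domain data) is what lets the second model pick
the noun reading.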

On Sun, Dec 11, 2011 at 2:48 PM, Jason Baldridge
<jasonbaldri...@gmail.com> wrote:

> Yep. Domain adaptation (and dealing with new languages) are as important,
> or more important, in NLP as they are in general for other types of
> problems that are addressed with machine learning. Once we get better at
> injecting better prior information about language (in the general sense)
> into our models, maybe that will start looking better.
>
> On Sun, Dec 11, 2011 at 11:04 AM, Josh Patterson <j...@cloudera.com>
> wrote:
>
> > ok, that makes more sense. I'm not that familiar with how training
> > affects NLP, but I am versed in training for general ML purposes ---
> > which seems to be the same idea here.
> >
> > Thanks,
> >
> > JP
> >
> > On Sun, Dec 11, 2011 at 9:12 AM, Jason Baldridge
> > <jasonbaldri...@gmail.com> wrote:
> > > For new domains (e.g. Twitter) and/or new languages, or using more data
> > > to get a better model. -Jason
> > >
> > > On Sat, Dec 10, 2011 at 10:07 PM, Josh Patterson <j...@cloudera.com>
> > > wrote:
> > >
> > >> working with the examples and reading:
> > >>
> > >> http://sourceforge.net/apps/mediawiki/opennlp/index.php?title=Sentence_Detector
> > >>
> > >> I've noticed the section on "Training"; Given that the models already
> > >> detect things like sentences and POS, in what circumstances would one
> > >> want to "train" the model further?
> > >>
> > >> Josh
> > >>
> > >> --
> > >> Twitter: @jpatanooga
> > >> Solution Architect @ Cloudera
> > >> hadoop: http://www.cloudera.com
> > >>
> > >
> > >
> > >
> > > --
> > > Jason Baldridge
> > > Associate Professor, Department of Linguistics
> > > The University of Texas at Austin
> > > http://www.jasonbaldridge.com
> > > http://twitter.com/jasonbaldridge
> >
