No, I mean letting the parser tag words itself. The per-tag accuracy might be lower, but parsers do better using their own predicted tags. Mike Collins did some experiments with this back in the late 1990s, and I think Dan Bikel did too in his reimplementation of Collins' parser. But, come to think of it, this is not true for Ratnaparkhi's parser (which is what the OpenNLP parser is based on), since it is discriminative, not generative. Anyway, the point is that this isn't always an obvious thing.

On Wed, Jul 6, 2011 at 8:07 AM, Jörn Kottmann <kottm...@gmail.com> wrote:

> On 7/6/11 2:57 PM, Jason Baldridge wrote:
>
>> Regardless of more data, it actually is typically better to let a parser tag
>> words by itself rather than to use a separate tagger.
>>
>
> So "by itself" you mean the POS Tagger trained on the parser training data?
>
> Jörn
>

--
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge