+1
On Wed, Apr 17, 2013 at 11:07 PM, Jason Baldridge <jasonbaldri...@gmail.com> wrote:

> +1 to doing this. I already removed that from Chalk for similar reasons.
> Also, the best way to do coreference these days is to build on the
> rule-based sieve approach given in this paper:
>
> http://www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00152
>
> -Jason
>
>
> On Wed, Apr 17, 2013 at 4:31 PM, Jörn Kottmann <kottm...@gmail.com> wrote:
>
> > Hi all,
> >
> > I am proposing that we move the coref component into the sandbox until
> > we manage to train and test it on a publicly available dataset. In the
> > current state it is complicated to maintain the code because without
> > training it can't be tested properly, which makes bigger changes to
> > OpenNLP difficult, for example the maxent refactoring.
> >
> > I tried to implement parsers for the MUC corpus and added training code,
> > but it does not yet work as well as the current models on SourceForge.
> > More work is needed to get everything fixed.
> >
> > Additionally the code should be refactored like the other components in
> > OpenNLP, e.g. one model instantiation, built-in evaluation, simple
> > training, etc. There is a Jira issue with all the details.
> >
> > Any opinions?
> >
> > Jörn
>
> --
> Jason Baldridge
> Associate Professor, Department of Linguistics
> The University of Texas at Austin
> http://www.jasonbaldridge.com
> http://twitter.com/jasonbaldridge