Coref problem (was: JWNL)

Jörn Kottmann Thu, 17 Nov 2011 01:44:08 -0800

We shouldn't replace JWNL with a newer version,
because we currently don't have the ability to train
or evaluate the coref component.


This is a big issue for us because that also blocks
other changes and updates to the code itself,
e.g. the cleanups Aliaksandr contributed.

What we need here is a plan for how we can get the coref component
into a state which makes it possible to develop it as a community.

If we don't find a way to resolve this, I think we should move the coref stuff
to the sandbox and leave it there until we have some training data.
Not having the ability to train coref also blocks changes we might want
to make to our maxent library.

Maybe it is possible to buy a license for the MUC 6 and 7 data, so we can share
this data privately within the team. Does anyone know whether that would be
possible under the LDC license?

The CoNLL 2011 data (OntoNotes, costs $50) might also be suitable to train it:

http://conll.bbn.com/index.php/data.html

Another option would be to label enough Wikinews data so that we are able to train it.


Jörn

On 11/17/11 5:50 AM, James Kosin wrote:

All,

I just saw this, which may be interesting to update to:
     http://sourceforge.net/projects/extjwnl/

James
