On 7/4/11 7:20 PM, Olivier Grisel wrote:
Keeping the correct link position from the original markup while
cleaning it can be tricky though. Be careful when tweaking the parser.
Maybe the Span helper classes from OpenNLP could help make this code
more robust.

I wonder how important the links are here, because we do not want to throw
away sentences which do not have links covering their entities.

But I believe the links might be very interesting for entity identification.
If, let's say, a person name is labeled and is also covered by a link, the
link can be used to identify the person mention.

And once we have a few manually labeled articles, we can use the links to
generate special features that are passed to the name finder.

So in the end, do we just generate an annotation for every link?!
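
To make the question concrete, here is a rough sketch of what "one annotation
per link" could look like while keeping the offsets aligned with the cleaned
text. It assumes wiki-style [[target|label]] link markup, and the names
LinkSpanExtractor, LinkSpan and Result are made up for illustration; they are
not part of OpenNLP or of the existing parser. In the real code the
start/end pair would probably map onto an OpenNLP Span instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: strips [[target|label]] markup (an assumed syntax) and
// records where each link label ends up in the cleaned text.
class LinkSpanExtractor {

    /** Half-open character span [start, end) in the cleaned text, plus the link target. */
    static final class LinkSpan {
        final int start;
        final int end;
        final String target;

        LinkSpan(int start, int end, String target) {
            this.start = start;
            this.end = end;
            this.target = target;
        }
    }

    /** Cleaned text together with one LinkSpan per link found in the markup. */
    static final class Result {
        final String text;
        final List<LinkSpan> links;

        Result(String text, List<LinkSpan> links) {
            this.text = text;
            this.links = links;
        }
    }

    static Result clean(String markup) {
        StringBuilder out = new StringBuilder();
        List<LinkSpan> links = new ArrayList<>();
        int i = 0;
        while (i < markup.length()) {
            int open = markup.indexOf("[[", i);
            int close = open < 0 ? -1 : markup.indexOf("]]", open + 2);
            if (open < 0 || close < 0) {
                // No further complete link: copy the rest unchanged.
                out.append(markup, i, markup.length());
                break;
            }
            // Copy the plain text in front of the link.
            out.append(markup, i, open);
            String body = markup.substring(open + 2, close);
            int pipe = body.indexOf('|');
            String target = pipe < 0 ? body : body.substring(0, pipe);
            String label = pipe < 0 ? body : body.substring(pipe + 1);
            // Offsets are taken from the output buffer, so they always refer
            // to the cleaned text and never to the original markup.
            int start = out.length();
            out.append(label);
            links.add(new LinkSpan(start, out.length(), target));
            i = close + 2;
        }
        return new Result(out.toString(), links);
    }
}
```

Calling LinkSpanExtractor.clean(...) on a paragraph returns the cleaned text
plus one span per link. Sentences without any link are still kept, they just
have no spans attached. Recording the offsets while the output is written,
rather than trying to map them back afterwards, is what keeps the positions
correct.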

Jörn
