Re: Human annotation tool for UIMA

Thilo Goetz Fri, 01 Jun 2007 00:21:33 -0700

Katrin Tomanek wrote:

Dear Andrew,
I am new to UIMA and am trying to find the best tool for doing human
document annotation.  For instance, if I am building a machine-learning
based named entity tagger and I want to tag some text with named
entities to train my recognizers, what would be the best way to do that?
I think that's a matter of human/manual annotation. Generating training
material for ML is a laborious task which is not an issue of UIMA (as
far as I understand). Depending on the entities, the domain, and the
language you are interested in, you might find annotated corpora (you
might check http://torvald.aksis.uib.no/corpora/ for existing corpora).
regards,
Katrin


Also check http://registry.dfki.de/ for software tools to manually
annotate text.  I have no personal experience with any of the tools
there, but I have heard Alembic being favorably mentioned.  It looks
like it is freely available.  It should be relatively easy to transform
the resulting XML to UIMA, either via XSLT, or with a custom XML
parser that reads the annotated data and feeds it into UIMA APIs.
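
To make the custom-parser route a bit more concrete, here is a rough
sketch of the first half of it: turning inline XML markup into plain
text plus stand-off spans with character offsets.  The element names
(doc, PERSON, ORG) and the class InlineXmlToSpans are made up for
illustration; I have not checked what Alembic's output actually looks
like, so treat the input format as an assumption.  It uses only the
JDK's SAX parser and no UIMA classes.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of UIMA: converts inline XML markup into
// plain text plus stand-off spans (type, begin, end).
final class InlineXmlToSpans {

    /** One stand-off annotation: element name and offsets into the plain text. */
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** The markup-free text together with the spans found in it. */
    static final class Result {
        final String text;
        final List<Span> spans;

        Result(String text, List<Span> spans) {
            this.text = text;
            this.spans = spans;
        }
    }

    static Result parse(String xml) {
        final StringBuilder text = new StringBuilder();
        final List<Span> spans = new ArrayList<>();
        final Deque<Integer> starts = new ArrayDeque<>();

        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                // Remember where this element's text starts.
                starts.push(text.length());
                depth++;
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                int begin = starts.pop();
                depth--;
                // Assumption: the root element only wraps the document, so skip it.
                if (depth > 0) {
                    spans.add(new Span(qName, begin, text.length()));
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Accumulate the text content with all markup stripped.
                text.append(ch, start, length);
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse annotated XML", e);
        }
        return new Result(text.toString(), spans);
    }

    public static void main(String[] args) {
        Result r = parse("<doc>Hello <PERSON>Ann</PERSON> from <ORG>IBM</ORG>.</doc>");
        System.out.println(r.text);
        for (Span s : r.spans) {
            System.out.println(s.type + " [" + s.begin + ", " + s.end + ")");
        }
    }
}
```

The second half would be to set the recovered text as the CAS document
text and create one annotation per span with the same begin/end
offsets, using whatever annotation types your type system defines.
Spans are emitted in the order their end tags are seen, so nested
elements come out inner-first.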

BTW, I have recently hacked UIMA's CAS Visual Debugger for a colleague
to allow creating manual annotations.  That was a one-off, though, and
I haven't fed it back into the main code base.  If people are interested
in that kind of functionality, let me know.  We wouldn't want to compete
with a dedicated annotation tool, though.

--Thilo
