I'm glad I happened to browse the archive today! I just joined the list
because I have noticed a couple of bugs that I want to post
somewhere. I developed and maintain Knowtator and am also steeped in
UIMA technology - I have been using it for just over a year and a half
now. I would love to make Knowtator tightly integrated with UIMA. The
frame-based representation that Knowtator uses via Protege (Frames
edition) is *very* similar to the type system framework in UIMA.
Protege frames are actually much more expressive, and I would be surprised
if their representational capability did not completely subsume that of
UIMA's type system. Knowtator cannot make use
of some of the more advanced representational constructs such as slot
inheritance (i.e. superslots and subslots, e.g. has-characteristic might
subsume has-color) or multiple inheritance - but I think it handles
anything that can be represented in a UIMA type system. I think it would
be a really nice fit.
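
To make the comparison concrete, here is a minimal sketch that lists the
frame constructs with no direct UIMA counterpart. It assumes that a UIMA
type has exactly one supertype and that features have no superfeature
relation. The dict layouts and the function name are hypothetical, made
up for this illustration; they are not Knowtator's or Protege's API.

```python
def unsupported_constructs(classes, slots):
    """Return frame constructs that a single-inheritance type system cannot express.

    classes: dict mapping class name -> list of parent class names
    slots:   dict mapping slot name  -> list of superslot names
    Both layouts are hypothetical, chosen only for this sketch.
    """
    problems = []
    for name, parents in sorted(classes.items()):
        # More than one parent means multiple inheritance.
        if len(parents) > 1:
            problems.append(
                f"class {name}: multiple inheritance from {', '.join(parents)}"
            )
    for name, supers in sorted(slots.items()):
        # Any superslot means slot inheritance is in use.
        if supers:
            problems.append(f"slot {name}: subslot of {', '.join(supers)}")
    return problems


if __name__ == "__main__":
    classes = {"Entity": [], "Gene": ["Entity"], "Enzyme": ["Protein", "Catalyst"]}
    slots = {"has-characteristic": [], "has-color": ["has-characteristic"]}
    for problem in unsupported_constructs(classes, slots):
        print(problem)
```

An empty result would suggest the frame hierarchy can be carried over to a
type system one class per type and one slot per feature.
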
A one-off solution as described previously is going to be extremely
frustrating. Things that you might want to do with an annotation tool
include:
- calculate inter-annotator agreement in a wide variety of ways
- consolidate/adjudicate disagreement between annotators
- mundane data management tasks
- keep track of who annotated what
- merge sets of annotations together
- stand-alone annotation on a laptop (a perk for annotators if it is a
part-time job that they can do during hours of their own choosing)
- work on / visualize subsets of annotations
- there are a bajillion user-interface considerations
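
As an illustration of the first item, here is a minimal sketch of two
common agreement measures: Cohen's kappa over labels assigned to the same
items, and an F1-style score over exact span matches. The function names
and the (begin, end, type) tuple layout are assumptions made for this
example; this is not a description of how Knowtator computes agreement.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items in order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def span_agreement(spans_a, spans_b):
    """F1-style agreement counting only exact (begin, end, type) matches."""
    set_a, set_b = set(spans_a), set(spans_b)
    if not set_a and not set_b:
        return 1.0
    matches = len(set_a & set_b)
    return 2 * matches / (len(set_a) + len(set_b))


if __name__ == "__main__":
    print(cohen_kappa(["gene", "gene", "O", "O"], ["gene", "O", "O", "O"]))
    print(span_agreement([(0, 4, "gene"), (10, 15, "protein")],
                         [(0, 4, "gene"), (10, 16, "protein")]))
```

Exact matching is the strictest choice; counting overlapping spans, or
ignoring the type, gives other variants of the same measure.
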
I'm sure there are many other things I haven't thought of off the top of
my head. We have been working really hard on Knowtator the last few
months and have tried to make it much easier to get started with and
more user friendly. If you have looked at it previously and got
discouraged, then I encourage you to take a look at the latest version
and the updated documentation. It is still clunky with respect to
importing and exporting annotation data (which we are currently working
to make easier) - but having a UIMA solution would make this problem
go away for this community. We have created some one-off scripts to
convert between Knowtator and UIMA that we could possibly make available with a
little effort if there is interest.
If you have ideas about how this effort could be funded I would be
grateful for suggestions. We are considering applying for an Eclipse
Innovation Award as an appropriate venue but we don't really know what
the odds are of getting it funded for this work. Or, if you have
interest in working on this yourself, I would be thrilled to provide
expertise.
Thanks,
Philip
----include----
One manual annotation tool that is open source is Knowtator (which is
licensed under MPL 1.1). As I understand it, Knowtator is intended for
manual annotation of entities and relationships in text. It is a layer on
top of the Protégé open source ontology editor. I'm not really familiar
enough with Knowtator to explicitly recommend it. Considering its
stated goals and the framework that it was developed on, it seems like
it might be particularly well suited to enabling manual annotations for
relatively elaborate type systems that have a lot of structure and many
common relation annotation types. The flip side is that it may be
overkill for the (more common) task of marking up instances of a flat
list of named-entity types. In any event, my point here is just that
anyone who is thinking of building a mapping from an open source manual
annotation tool to UIMA may want to consider Knowtator, especially if
they are interested in a lot of expressive power.