On 10/4/11 9:41 PM, Eddie Epstein wrote:
Historically in UIMA this document ID info is saved in the
SourceDocumentInformation annotation, in the uri feature. Many UIMA
SDK samples rely on the ID here. When applications want additional
metadata they then add features to the SourceDocumentInformation type
definition for that purpose.

I usually define my own document ID feature structure, which contains
my unique ID. I always thought that was a bit cumbersome to use, and
wondered whether having an ID field per CAS might help, but it sounds
like there are good reasons why it was never implemented.

In the OpenNLP UIMA Integration I have AEs which can train the
components. One issue there is that it is hard to map log messages
to the actual CAS that caused them. To solve this I will now just add
a type mapping, so a user can configure a custom ID feature structure
type and feature.
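
To make the type mapping idea a bit more concrete, here is a rough
sketch in plain Java. It is not the actual OpenNLP code: the class
name DocumentIdMapping, the type name "org.example.DocumentId" and the
feature name "id" are all made up for illustration, and the nested Map
is only a simplified stand-in for the content of one CAS. In a real AE
the lookup would go through the CAS type system instead.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical holder for the user-configured mapping: which type and
// which feature carry the document ID. Not part of UIMA or OpenNLP.
final class DocumentIdMapping {

    private final String typeName;    // configured ID type, e.g. "org.example.DocumentId"
    private final String featureName; // configured ID feature, e.g. "id"

    DocumentIdMapping(String typeName, String featureName) {
        this.typeName = typeName;
        this.featureName = featureName;
    }

    // casContent maps type name -> (feature name -> value); it stands in
    // for a single CAS here. Returns empty if the type or feature is absent.
    Optional<String> resolveId(Map<String, Map<String, String>> casContent) {
        Map<String, String> featureStructure = casContent.get(typeName);
        if (featureStructure == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(featureStructure.get(featureName));
    }

    // Prefixes a log message with the resolved document ID, or "unknown"
    // when the configured type or feature cannot be found.
    String tagLogMessage(Map<String, Map<String, String>> casContent, String message) {
        return "[doc " + resolveId(casContent).orElse("unknown") + "] " + message;
    }
}
```

With a mapping like this, a training AE could tag each warning with
the ID of the document it was processing, and fall back to "unknown"
when the user did not configure a type or the CAS does not contain it.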

Jörn
