On Tue, Oct 4, 2011 at 5:34 AM, Jörn Kottmann <[email protected]> wrote:
> In the end I believe a simple CAS ID field could be quite useful, for
> debugging/logging, as a
> document ID in simple UIMA pipelines and for applications which deal with
> whole CASes
> (e.g. the Cas Editor based annotation tooling, or an AE which extracts
> "problematic" CASes
> from an analysis pipeline for inspection).
>
> To implement this I suggest that we extend the CAS interface with
> CAS.setId(String) and CAS.getId() methods.

Historically, UIMA has stored this document ID in the uri feature of
the SourceDocumentInformation annotation. Many UIMA SDK samples rely
on the ID there. Applications that want additional metadata add
features to the SourceDocumentInformation type definition for that
purpose.

If one were to implement CAS.setId(), the data should be stored in the
CAS as a type/feature so that all of the different CAS serialization
and transport mechanisms remain unchanged. An additional feature in
SofaFS would probably be best. Presumably this string should be
immutable once set (as are other SofaFS features)?
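
A minimal sketch of that idea, in plain Java rather than the UIMA API.
ToySofa and all of its methods are hypothetical stand-ins (this is not
SofaFS). It only illustrates keeping the ID as ordinary feature data,
so that a serializer that walks the features needs no change, and
rejecting a second write.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy stand-in for a feature structure. Hypothetical; NOT the UIMA SofaFS class. */
final class ToySofa {
    // All state lives in one feature map, so code that serializes the
    // features picks up the ID without any change to the format.
    private final Map<String, String> features = new LinkedHashMap<>();

    ToySofa(String sofaId) {
        features.put("sofaID", sofaId);
    }

    /** Write-once setter: a second call is rejected, whatever the value. */
    void setId(String id) {
        if (id == null) {
            throw new IllegalArgumentException("id must not be null");
        }
        if (features.containsKey("id")) {
            throw new IllegalStateException("id is already set");
        }
        features.put("id", id);
    }

    /** Returns the ID, or null if none has been set yet. */
    String getId() {
        return features.get("id");
    }

    /** Stand-in for serialization: emits every feature in insertion order. */
    String serialize() {
        return features.toString();
    }
}
```

With this toy model, calling setId("doc-42") once succeeds and the ID
shows up in serialize() next to the other features; a second setId()
call throws IllegalStateException.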

It is still not clear to me that this feature adds value beyond
application-specific type system data.

Eddie
