On 04.10.2011 at 21:41, Eddie Epstein wrote:

> On Tue, Oct 4, 2011 at 5:34 AM, Jörn Kottmann <[email protected]> wrote:
>> In the end I believe a simple CAS ID field could be quite useful, for
>> debugging/logging, as a document ID in simple UIMA pipelines and for
>> applications which deal with whole CASes (e.g. the Cas Editor based
>> annotation tooling, or an AE which extracts "problematic" CASes
>> from an analysis pipeline for inspection).
>>
>> To implement this I suggest that we extend the CAS interface with
>> CAS.setId(String) and CAS.getId() methods.
>
> If one were to implement CAS.setID() the data should be stored in the
> CAS as a type/feature so that all of the different CAS serialization
> and transport mechanisms are unchanged. Probably as an additional
> feature in SofaFS would be best. Presumably this string would want to
> be immutable (as are other SofaFS features)?
>
> Still not clear to me that this feature adds value beyond application
> specific type system data.

I can well understand the use cases described by Jörn and Burn, in which they need an identifier in the CAS that can be used to associate the CAS with a context holding information not contained in the CAS itself: either a UIMA-AS context or some database storage knowledge.

I have also, at times, wanted to associate arbitrary Java objects with a particular CAS instance. I ended up creating a static HashMap using the CAS instance itself as key, but I would have preferred some kind of "session" information that is associated with the CAS itself (there is a session in the UIMAContext of components, but not for the CAS). I consider that solution a hack, though, because it only works in a non-distributed environment. For a distributed environment like UIMA-AS, associating an ID with a CAS in such a manner that no custom FeatureStructure is required would be convenient.

I still wonder whether such an ID should be associated with each view separately or with the whole CAS object.

I also wonder whether it would be good to have generic string key/value properties associated with a CAS or view. These could replace the SourceDocumentInformation, allow for arbitrary metadata such as that generated by the TikaAnnotator, and could be used to store Jörn's DB ID, Burn's UIMA-AS ID and my URI/baseURI, even all of them at the same time. If there is only one ID field, different applications might compete for it. The properties could be stored in the SofaFS (mutable, please) and there could be convenient CAS.getProperty(String) and CAS.setProperty(String,String) methods.

Cheers,

Richard

--
-------------------------------------------------------------------
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab
FB 20 Computer Science Department
Technische Universität Darmstadt
Hochschulstr. 10, D-64289 Darmstadt, Germany
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
[email protected]
www.ukp.tu-darmstadt.de
Web Research at TU Darmstadt (WeRC)
www.werc.tu-darmstadt.de
-------------------------------------------------------------------
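
For illustration only, the static-map workaround mentioned in the message could be combined with the key/value idea roughly as follows. This is a minimal sketch under stated assumptions, not UIMA code: the class name `CasProperties` and its methods are hypothetical, the CAS is represented by a plain `Object` so the example has no UIMA dependency, and the use of weak keys is a choice made here rather than something the message describes. As the message points out, a side table like this only works inside a single JVM, which is the reason for proposing storage in the CAS itself.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical side table of string properties per CAS; not part of the UIMA API. */
final class CasProperties {
    // Weak keys (an assumption of this sketch) allow an entry to be dropped
    // once the CAS object it belongs to is no longer referenced elsewhere.
    private static final Map<Object, Map<String, String>> STORE =
            Collections.synchronizedMap(new WeakHashMap<Object, Map<String, String>>());

    private CasProperties() {
    }

    /** Associates key/value with the given CAS stand-in, creating its property map on demand. */
    static void setProperty(Object cas, String key, String value) {
        STORE.computeIfAbsent(cas,
                c -> Collections.synchronizedMap(new HashMap<String, String>()))
             .put(key, value);
    }

    /** Returns the value stored for key on the given CAS stand-in, or null if there is none. */
    static String getProperty(Object cas, String key) {
        Map<String, String> props = STORE.get(cas);
        return props == null ? null : props.get(key);
    }
}
```

With separate keys, several applications can attach their own identifiers to the same CAS without competing for a single ID field, for example one key for a database ID and another for a base URI.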
