Some of this has been previously stated. I'm summarizing :-)
------------
It seems these would be nice to have at runtime, not just externally.
Assigning them at runtime has potential issues for "parallel" processing of CASes. Parallelism can arise in UIMA-AS scheduling using the flow controller parallel-step option. It can also arise in a simple application associated with a CAS Store, where the operation is to deserialize an existing CAS, add FSs to it, and reserialize the result back to the store *under the same CAS id*. The parallel use case here is that many of these operations could occur simultaneously.

Of course, the reserializing would need to take account of the "high-water-mark" - just as is done for the flow-controller parallel-step option. In that case, we also declare it "illegal" for annotators to update feature structures "below the high-water-mark", because if two annotators updated the same slot, the later one would "win" and the earlier update would be "lost".

Running in parallel means it may be hard to assign the "next" available unique FS id at FS creation time - so that's a problem to address.
--------------
Another (potential) problem: if the FS id is added, this could significantly increase the CAS size. For some applications, this could be an issue. So I hope the architecture allows modes of operation where no space is taken in the CAS for this. Something like this may also be needed for backwards compatibility.
--------------
It may be that many FSs in the CAS won't need a unique FSid. An example: UIMA supports lists made out of Lisp-like "cons" cells - the FSList structure has 2 slots - one is a reference (or nil) to the next cons object, the other is a reference to the item in the list at that spot. I've seen applications that have thousands or more of these cons cells. They are never individually "indexed" (except perhaps occasionally the "head" of the list), but just serve to create the list.
I wonder if an architecture for unique FSids could account for this, and not have any overhead for FeatureStructures which won't need a unique FSid.

-Marshall
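
For reference, here is a rough sketch of the "high-water-mark" rule described above, written as a toy Java model. Nothing in it is UIMA API: `ToyCas`, `setMark`, `isBelowMark`, and `update` are made-up names for illustration, the "CAS" is just a list of strings, and throwing an exception on a below-the-mark update is an assumption (the rule as stated is only a declared convention).

```java
import java.util.ArrayList;
import java.util.List;

public class HighWaterMarkSketch {

    /** Hypothetical stand-in for a CAS: each list entry plays the role of one FS. */
    public static final class ToyCas {
        private final List<String> slots = new ArrayList<>();
        // Entries with index < mark existed before the parallel step began.
        private int mark = 0;

        /** Adds a new "FS" and returns its index. */
        public int add(String value) {
            slots.add(value);
            return slots.size() - 1;
        }

        /** Records the current size as the high-water-mark. */
        public void setMark() {
            mark = slots.size();
        }

        public boolean isBelowMark(int index) {
            return index < mark;
        }

        public String get(int index) {
            return slots.get(index);
        }

        /** Allows updates only to entries created after the mark (assumed enforcement). */
        public void update(int index, String value) {
            if (isBelowMark(index)) {
                throw new IllegalStateException(
                        "entry " + index + " is below the high-water-mark");
            }
            slots.set(index, value);
        }
    }

    public static void main(String[] args) {
        ToyCas cas = new ToyCas();
        int existing = cas.add("Token[0,5]");   // present before the parallel step
        cas.setMark();

        int added = cas.add("Sentence[0,20]");  // created above the mark
        cas.update(added, "Sentence[0,25]");    // fine: this branch owns it

        try {
            cas.update(existing, "Token[0,6]"); // would risk a lost update
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that everything a parallel branch may touch lies above the mark, so merging results back reduces to appending new entries. It does not address how those new entries would get unique FS ids across branches, which is the open problem noted above.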
