This has been stewing for a long time, so it's time to get it out in the open here.
Users have difficulty figuring out how to use FeatureStructures that are not derived from annotations (and not intended to just be subordinate objects referenced from annotations). I have personally had to help several who wanted to create such an FS, add it to the CAS, and get it back out later, but couldn't figure out how to proceed. The answer of course is that they need to define a custom index, even if they don't care about the sort order. But instead, a common workaround is to make the type inherit from Annotation and simply ignore the begin and end features. In practice that's a lot easier than dealing with a custom index.

The most common use case for this is when the object is a "singleton", for example a DocumentMetaData object, in which case there's another not-so-nice solution: add features to DocumentAnnotation. That hinders interoperability, though, so it would be nice to give users another, easy way to do this. However, I think the issue is more general than singletons, and could apply any time the user wants to just add and retrieve FSs from the CAS without caring about their ordering.

I think it's a weakness of UIMA that we make this so difficult to do, and that we should try to improve this. I'm open to whatever designs people can come up with to address this.

Eddie, Marshall, and I had a proposal quite some time ago that we were never able to achieve consensus on, although I'm not sure it was 100% understood what we were proposing at the time. The basic idea is that CAS.addFsToIndexes(FS) and IndexRepository.addFs(FS) should *always* add the FS to an index. If no appropriate index exists, we just create a bag index. The FS can then be retrieved by using IndexRepository.getAllIndexedFS(Type). The thinking is that if an annotator bothered to try to add something to the indexes, there was a reason for it, and it's a whole lot better to respect that than to just silently ignore it.
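
To make the proposed behavior concrete, here is a small toy model in plain Java. It is only a sketch of the semantics described above and does not use the UIMA API: ToyFs, ToyIndexRepository, declareIndex, and usesDefaultBag are made-up names for illustration, types are plain strings, and subtypes and sort order are ignored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the proposed "always index" semantics. NOT the UIMA API. */
public class BagIndexSketch {

    /** Hypothetical stand-in for a feature structure: a type name plus a payload. */
    public static final class ToyFs {
        public final String type;
        public final String payload;

        public ToyFs(String type, String payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    /** Hypothetical stand-in for an index repository. */
    public static final class ToyIndexRepository {
        // Types for which the (imaginary) descriptor declared an index.
        private final Set<String> declaredTypes = new HashSet<>();
        // Types that received an implicit bag index because none was declared.
        private final Set<String> defaultBagTypes = new HashSet<>();
        // Indexed FSs per type, kept in insertion order (no sorting in this toy).
        private final Map<String, List<ToyFs>> contents = new HashMap<>();

        /** Models an index definition in a descriptor. */
        public void declareIndex(String type) {
            declaredTypes.add(type);
            contents.putIfAbsent(type, new ArrayList<>());
        }

        /** Proposed behavior: never a silent no-op; fall back to a bag index. */
        public void addFs(ToyFs fs) {
            if (!declaredTypes.contains(fs.type)) {
                defaultBagTypes.add(fs.type); // implicit bag index created on demand
            }
            contents.computeIfAbsent(fs.type, t -> new ArrayList<>()).add(fs);
        }

        /** Returns everything indexed for the given type (exact type match only). */
        public List<ToyFs> getAllIndexedFS(String type) {
            return Collections.unmodifiableList(
                    contents.getOrDefault(type, Collections.emptyList()));
        }

        /** True if the type is served by an implicit bag index. */
        public boolean usesDefaultBag(String type) {
            return defaultBagTypes.contains(type);
        }
    }

    public static void main(String[] args) {
        ToyIndexRepository repo = new ToyIndexRepository();
        repo.declareIndex("Token"); // an index the descriptor defines

        repo.addFs(new ToyFs("Token", "hello"));
        // No index was declared for this type; under the proposal it is still kept.
        repo.addFs(new ToyFs("DocumentMetaData", "language=en"));

        for (ToyFs fs : repo.getAllIndexedFS("DocumentMetaData")) {
            System.out.println(fs.type + ": " + fs.payload);
        }
    }
}
```

In this toy, the "singleton" case is just a type whose bag happens to hold one FS, and a type that never has addFs called on it never gets a bag at all.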
Note that this doesn't cause any loss in performance if an annotator never adds an FS to the indexes. We still support subordinate FS that are linked off of other FS but never indexed.

I've heard of a rare case where users might have "optional" indexes. The idea is that an annotator might call addFsToIndexes "just in case" some downstream component might actually care about such an index. Then when used in an application that doesn't require that index, the index is not defined in the descriptor, making the addFsToIndexes call a no-op. That's the only case that would suffer a performance impact as a result of implementing this proposal. But I think this is easily addressed via a configuration parameter (and, I think I heard in the one annotator that did this, it is already replaced by a configuration parameter anyway, since that provided even better performance than having to check if the index exists for every FS that was created).

In summary I think this design has a lot of nice properties. It makes it very easy to add things to the CAS and get them out again, if you don't care about order. At a later point, if you start to care about the sort order, you can go back and easily define a sorted index. There's almost no effect of this proposal on existing code, excepting the one rare case from the last paragraph which had its own performance problems anyway and has an easy fix.

-Adam
