I opened a Jira issue proposing a change in how 0-length UIMA arrays and lists are managed. These are immutable objects, and in principle one instance (per CAS) could be shared.
In the current implementation, sharing is managed explicitly by the user: they can call a set of new APIs to get shared instances. I think a better approach is to make sharing automatic and remove those new APIs (IMHO, a smaller API set for essentially the same functionality is always a good thing).

The implementation would change so that the calls that create "new" 0-length arrays/lists create an instance only if none already exists; if one already exists, they return it. This follows Java's general direction for immutable objects, such as Strings and Integer values, which can be shared.

For cases where people want or need a CAS value "marker" that is tiny but unique (like you get with Java's new Object()), we would keep "new TOP(aCas)" as something that generates unique instances.

What do others think? I've seen large-scale implementations of UIMA pipelines with lots of defaulted 0-length arrays in them; for these, the change has the potential to improve space/time performance by a reasonable amount.

-Marshall
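
P.S. To make the proposal concrete, here is a minimal sketch of the "create only if none exists" behavior. It does not use the real UIMA classes: SketchCas, SketchIntArray, createIntArray and createMarker are hypothetical names invented for illustration, and the actual implementation would presumably need one such cached instance per array/list type.

```java
/** Hypothetical stand-in for a CAS; NOT the real UIMA API. */
final class SketchCas {

    /** Hypothetical stand-in for a UIMA integer array. */
    static final class SketchIntArray {
        private final int[] values;

        SketchIntArray(int length) {
            this.values = new int[length];
        }

        int size() {
            return values.length;
        }
    }

    // The single shared 0-length instance for this CAS, created on first request.
    private SketchIntArray emptyIntArray;

    /** Returns the shared instance for length 0, otherwise a fresh array. */
    SketchIntArray createIntArray(int length) {
        if (length == 0) {
            if (emptyIntArray == null) {
                emptyIntArray = new SketchIntArray(0);
            }
            return emptyIntArray;
        }
        return new SketchIntArray(length);
    }

    /** Plays the role of "new TOP(aCas)" in this sketch: always a unique object. */
    Object createMarker() {
        return new Object();
    }
}
```

With this, two requests for a 0-length array on the same CAS return the same object, while non-empty arrays and markers stay unique. One consequence worth keeping in mind: any user code that compares 0-length arrays by identity (==) would see different results than it does today.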