[
https://issues.apache.org/jira/browse/UIMA-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marshall Schor updated UIMA-4357:
---------------------------------
Component/s: Core Java Framework
> create auxiliary flattened version of index and its subtypes, automatically
> managed
> -----------------------------------------------------------------------------------
>
> Key: UIMA-4357
> URL: https://issues.apache.org/jira/browse/UIMA-4357
> Project: UIMA
> Issue Type: Improvement
> Components: Core Java Framework
> Reporter: Marshall Schor
> Priority: Minor
> Fix For: 2.7.1SDK
>
>
> UIMA indexes allow retrieving items from the CAS, trading off space (for
> indexes) for time (speed of finding items in the CAS, speed of iterating).
> For sorted indexes over a type with subtypes, if the index isn't being
> modified, it is possible to do a one-time extraction in sorted order of the
> items and save this in an array, and iterate much more rapidly over that.
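
As a rough illustration of the one-time extraction described above, here is a minimal sketch that merges several already-sorted lists into one flat sorted list. It is not UIMA code: the class FlattenSketch, the Cursor helper, and the use of plain java.util lists as stand-ins for the per-type sub-indexes are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch, not UIMA code. Each inner list stands in for the
// already-sorted index of one type or subtype.
final class FlattenSketch {

  // Pairs the current head element of one sub-index with the rest of its iterator.
  private static final class Cursor<T> {
    T head;
    final Iterator<T> rest;

    Cursor(Iterator<T> it) {
      this.rest = it;
      this.head = it.next();
    }
  }

  // Merges the sorted sub-indexes once into a single sorted list.
  static <T> List<T> flatten(List<List<T>> sortedSubIndexes, Comparator<? super T> order) {
    PriorityQueue<Cursor<T>> heap =
        new PriorityQueue<>((a, b) -> order.compare(a.head, b.head));
    int total = 0;
    for (List<T> sub : sortedSubIndexes) {
      total += sub.size();
      if (!sub.isEmpty()) {
        heap.add(new Cursor<>(sub.iterator()));
      }
    }
    List<T> flat = new ArrayList<>(total);
    while (!heap.isEmpty()) {
      Cursor<T> c = heap.poll();   // sub-index with the smallest head
      flat.add(c.head);
      if (c.rest.hasNext()) {      // advance it and put it back in the heap
        c.head = c.rest.next();
        heap.add(c);
      }
    }
    return flat;
  }
}
```

Iterating over the returned list afterwards needs no heap maintenance at all, which is where the speedup would come from.
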
> I've seen lots of cases of UIMA flows where some annotators will create and
> index a type (and its subtypes), and once that's been done, the indexes are
> not subsequently updated for these types, but downstream annotators iterate
> over them. It seems that lazy creation of this kind of flattened index
> would work well in many cases.
> It is important, I think, to preserve the same kind of
> ConcurrentModificationException detection that iterators have today. To
> make this additional space-time trade-off automatic and reasonable, make
> the flattened index reachable only via a SoftReference, to allow the GC to
> reclaim the space if needed.
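
One way the two requirements above could fit together is sketched below, under stated assumptions: the class SoftFlatIndex, its modCount field, and the reset() method are hypothetical names, and a plain ArrayList stands in for the real index. The flattened copy is held only through a SoftReference, and iterators compare a modification counter to detect concurrent updates.

```java
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch; none of these names come from UIMA.
final class SoftFlatIndex<T extends Comparable<T>> {
  private final List<T> items = new ArrayList<>(); // stand-in for the real index
  private int modCount = 0;                        // bumped on every index update
  private SoftReference<List<T>> flatRef = new SoftReference<List<T>>(null);
  private int flatModCount = -1;                   // modCount when the flat copy was built

  void add(T item) {
    items.add(item);
    modCount++;
  }

  // Models a reset: empties the index and drops the flattened copy.
  void reset() {
    items.clear();
    modCount++;
    flatRef.clear();
  }

  // Returns the cached sorted copy, rebuilding it if the GC reclaimed it
  // or the index changed since it was built.
  private List<T> flat() {
    List<T> flat = flatRef.get();
    if (flat == null || flatModCount != modCount) {
      flat = new ArrayList<>(items);
      Collections.sort(flat);
      flatRef = new SoftReference<>(flat);
      flatModCount = modCount;
    }
    return flat;
  }

  Iterator<T> iterator() {
    final List<T> snapshot = flat(); // strong reference for the iterator's lifetime
    final int expected = modCount;
    return new Iterator<T>() {
      private int pos = 0;

      @Override public boolean hasNext() {
        return pos < snapshot.size();
      }

      @Override public T next() {
        if (modCount != expected) {
          throw new ConcurrentModificationException();
        }
        if (pos >= snapshot.size()) {
          throw new NoSuchElementException();
        }
        return snapshot.get(pos++);
      }
    };
  }
}
```

Each iterator keeps a strong reference to the array it walks, so the GC can only reclaim the flattened copy when no iteration is in progress.
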
> Delay the creation of a flattened version until there's evidence that the
> index will stay unmodified for some time. As that evidence, count the
> number of times an iterator over the index runs the "heapifyUp/Down" code
> that manages the ordering of the subiterators to preserve sort order. A
> basic indicator may be the number of times that occurs, without an
> intervening update to the indexes, relative to the size of the index.
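
The trigger described above might look something like the following sketch. The class name FlattenHeuristic, its callback methods, and especially the multiplicative factor are assumptions for illustration; the issue does not specify a threshold.

```java
// Hypothetical sketch; the names and the threshold factor are assumptions.
final class FlattenHeuristic {
  private final int factor;      // assumed tuning knob, not taken from the issue
  private long heapifyCount = 0; // heapify steps since the last index update

  FlattenHeuristic(int factor) {
    this.factor = factor;
  }

  // Called each time an iterator reorders its subiterators (heapifyUp/Down).
  void onHeapify() {
    heapifyCount++;
  }

  // Called on any add or remove; the evidence starts over.
  void onIndexUpdate() {
    heapifyCount = 0;
  }

  // True once the reordering work done on an unmodified index exceeds
  // factor * indexSize.
  boolean shouldFlatten(int indexSize) {
    return indexSize > 0 && heapifyCount > (long) factor * indexSize;
  }
}
```

Scaling the threshold by the index size means a large index needs proportionally more read traffic before the cost of building the flattened copy is considered worthwhile.
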
> The flattened array can save a bit more time by holding references to the
> Java cover class instance (JCas or non-JCas) for each item.
> A CAS reset needs to discard these flattened arrays.