[ 
https://issues.apache.org/jira/browse/UIMA-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marshall Schor updated UIMA-4357:
---------------------------------
    Component/s: Core Java Framework

> create auxiliary flattened version of index and its subtypes, automatically 
> managed
> -----------------------------------------------------------------------------------
>
>                 Key: UIMA-4357
>                 URL: https://issues.apache.org/jira/browse/UIMA-4357
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>            Reporter: Marshall Schor
>            Priority: Minor
>             Fix For: 2.7.1SDK
>
>
> UIMA indexes allow retrieving items from the CAS, trading off space (for 
> indexes) for time (speed of finding items in the CAS, speed of iterating).  
> For sorted indexes over a type with subtypes, if the index isn't being 
> modified, it is possible to extract the items once, in sorted order, into 
> an array, and then iterate over that array much more rapidly. 
> I've seen lots of cases of UIMA flows where some annotators will create and 
> index a type (and its subtypes), and once that's been done, the indexes are 
> not subsequently updated for these types, but downstream annotators iterate 
> over them.  It seems that a lazy creation for this kind of flattened index 
> would work well in many cases.
> I think it is important to preserve the same kind of 
> ConcurrentModificationException detection.  To make this additional 
> space-time trade-off automatic and reasonable, make the flattened index 
> reachable only via a SoftReference, so the GC can reclaim the space if 
> needed.  
> Delay creating the flattened version until there is evidence that the index 
> will remain unmodified for some time.  As a measure of that, count how often 
> iterators over the index run the "heapifyUp/Down" code that manages the 
> ordering of the subiterators to preserve sort order.  A basic indicator may 
> be that count, with no intervening update to the indexes, relative to the 
> size of the index.
> The flattened array can save a bit more time by holding references to the 
> Java cover class instance (JCas or non-JCas) for each item. 
> CAS reset needs to clear out these flattened versions.
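
The proposal above can be illustrated with a minimal sketch. It is not the UIMA implementation and uses no UIMA API: the class FlattenedIndexSketch, its threshold parameter, and the slowSteps counter are all hypothetical names chosen for illustration. Sorting a copy of the items stands in for merging the subtype iterators, and the step count stands in for the heapifyUp/Down count.

```java
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a sorted index; none of these names are UIMA API.
class FlattenedIndexSketch<T extends Comparable<T>> {
    private final List<T> items = new ArrayList<>(); // stands in for the per-type sorted indexes
    private final int threshold;   // assumed tuning knob: slow work, in multiples of index size, before flattening
    private int modCount = 0;      // bumped on every index update
    private long slowSteps = 0;    // stand-in for the heapifyUp/Down count since the last update
    private SoftReference<List<T>> flattened = new SoftReference<>(null);

    FlattenedIndexSketch(int threshold) { this.threshold = threshold; }

    void add(T item) { items.add(item); invalidate(); }

    // Analogue of CAS reset: drop the items and any flattened copy.
    void reset() { items.clear(); invalidate(); }

    boolean isFlattened() { return flattened.get() != null; }

    private void invalidate() {
        modCount++;
        slowSteps = 0;
        flattened = new SoftReference<>(null);
    }

    Iterator<T> iterator() {
        List<T> snapshot = flattened.get();
        if (snapshot == null) {
            // Slow path: sorting a copy stands in for merging the subtype iterators.
            snapshot = new ArrayList<>(items);
            Collections.sort(snapshot);
            slowSteps += snapshot.size();
            // Keep the copy once the slow work is large relative to the index size.
            if (!snapshot.isEmpty() && slowSteps >= (long) threshold * snapshot.size()) {
                flattened = new SoftReference<>(snapshot);
            }
        }
        final List<T> view = snapshot;
        final int expectedModCount = modCount;
        return new Iterator<T>() {
            private int pos = 0;
            @Override public boolean hasNext() { return pos < view.size(); }
            @Override public T next() {
                // Same style of detection as the regular iterators: fail if the index changed.
                if (modCount != expectedModCount) throw new ConcurrentModificationException();
                if (!hasNext()) throw new NoSuchElementException();
                return view.get(pos++);
            }
        };
    }
}
```

With a threshold of 2, the first pass over an unmodified index takes the slow path, the second pass caches the sorted copy, and later passes reuse it until the next add or reset. Because the copy is held only through a SoftReference, the GC may discard it under memory pressure, in which case the next iterator simply falls back to the slow path.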


