[ 
https://issues.apache.org/jira/browse/UIMA-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marshall Schor resolved UIMA-4003.
----------------------------------
    Resolution: Fixed

> add alternative int - int maps, and int sets for better space/time performance
> ------------------------------------------------------------------------------
>
>                 Key: UIMA-4003
>                 URL: https://issues.apache.org/jira/browse/UIMA-4003
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 2.6.0SDK
>            Reporter: Marshall Schor
>            Assignee: Marshall Schor
>            Priority: Minor
>             Fix For: 2.6.1SDK
>
>
> int - int maps and int sets are implemented in UIMA as either red-black trees 
> (size = 5 words (4 words for sets) + 1 bit per item, search time = log 2 size 
> (binary search), insert /removal can cause rebalancing tree), or as 
> intVectors (like ArrayList<Integers> but doesn't wrap ints as Integers).
> For int - int maps, add a hash version (loses key "ordering"), which takes 3 
> - 6 words per item (avg 4.5 words - slightly smaller), and has O(1) 
> performance (based on existing JCasHashMap impl, but without concurrency 
> support).
> For int sets, add 2 styles: one based on the same hash map (space = 1.5 to 3 
> words per item vs 4 words), the other based on BitSets.  The BitSet one would 
> be faster, would have a conventional key ordering, and would be smaller, if 
> the max int stored < 128 * number of items stored.  (This is often the case 
> in uses where it is used to track Feature Structure addresses).
> Finally, add one adaptable int set that is usually is implemented as a 
> BitSet, but if the items being stored are such that the size would be 
> significantly larger than the IntHashSet implementation, switch to that 
> alternative representation.   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to