[
https://issues.apache.org/jira/browse/UIMA-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marshall Schor resolved UIMA-4003.
----------------------------------
Resolution: Fixed
> add alternative int - int maps, and int sets for better space/time performance
> ------------------------------------------------------------------------------
>
> Key: UIMA-4003
> URL: https://issues.apache.org/jira/browse/UIMA-4003
> Project: UIMA
> Issue Type: Improvement
> Components: Core Java Framework
> Affects Versions: 2.6.0SDK
> Reporter: Marshall Schor
> Assignee: Marshall Schor
> Priority: Minor
> Fix For: 2.6.1SDK
>
>
> int - int maps and int sets are implemented in UIMA as either red-black trees
> (size = 5 words (4 words for sets) + 1 bit per item, search time = log 2 size
> (binary search), insert /removal can cause rebalancing tree), or as
> intVectors (like ArrayList<Integers> but doesn't wrap ints as Integers).
> For int - int maps, add a hash version (loses key "ordering"), which takes 3
> - 6 words per item (avg 4.5 words - slightly smaller), and has O(1)
> performance (based on existing JCasHashMap impl, but without concurrency
> support).
> For int sets, add 2 styles: one based on the same hash map (space = 1.5 to 3
> words per item vs 4 words), the other based on BitSets. The BitSet one would
> be faster, would have a conventional key ordering, and would be smaller, if
> the max int stored < 128 * number of items stored. (This is often the case
> in uses where it is used to track Feature Structure addresses).
> Finally, add one adaptable int set that is usually is implemented as a
> BitSet, but if the items being stored are such that the size would be
> significantly larger than the IntHashSet implementation, switch to that
> alternative representation.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)