[ https://issues.apache.org/jira/browse/UIMA-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thilo Goetz reopened UIMA-1068:
-------------------------------
Fix in 2.2.2 hotfix 1.
> Use of the JCas cache should be configurable
> --------------------------------------------
>
> Key: UIMA-1068
> URL: https://issues.apache.org/jira/browse/UIMA-1068
> Project: UIMA
> Issue Type: Improvement
> Components: Core Java Framework
> Affects Versions: 2.2.2
> Reporter: Thilo Goetz
> Assignee: Thilo Goetz
> Fix For: 2.3
>
>
> The JCas caches all CAS objects that are accessed through it. This means
> that JCas objects that are no longer used can't be garbage collected. If
> only part of the processing chain uses the JCas, or the caching is redundant
> for some other reason, this produces a severe memory overhead.
> I ran the same experiment I ran for UIMA-1067: doubled the size of Moby Dick
> and ran the POS tagger from the sandbox. I used the improved version from
> UIMA-1067 as the base case and simply commented out the line that adds JCas
> objects to the cache. This reduced the required heap size from 115MB to
> 105MB. It also reduced the running time from around 10s for the base case
> to consistently under 9s for the version without any caching. I looked at
> the tagger source code and saw that it keeps its own list of tokens around,
> so the savings come only from the caching data structure itself.
> There may be cases where the JCas cache is a performance win, though I'd be
> curious to see the benchmarks. So we should not just turn it off, but make
> it configurable.
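
To illustrate the trade-off described above, here is a minimal sketch of a
wrapper cache with an on/off switch. It is not UIMA code: the class name
ConfigurableWrapperCache, the cachingEnabled flag, and the use of integer
addresses as keys are all assumptions made for the example. The actual JCas
implementation and whatever configuration mechanism the fix uses may look
quite different.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

/**
 * Hypothetical stand-in for a JCas-style wrapper cache; NOT the UIMA
 * implementation. Maps an integer address to a wrapper object.
 */
final class ConfigurableWrapperCache<T> {
    private final boolean cachingEnabled;          // assumed configuration switch
    private final IntFunction<T> wrapperFactory;   // creates a wrapper for an address
    private final Map<Integer, T> cache = new HashMap<>();

    ConfigurableWrapperCache(boolean cachingEnabled, IntFunction<T> wrapperFactory) {
        this.cachingEnabled = cachingEnabled;
        this.wrapperFactory = wrapperFactory;
    }

    /** Returns a wrapper for the address, reusing a cached one only if caching is on. */
    T get(int address) {
        if (!cachingEnabled) {
            // Nothing is retained here, so the wrapper becomes collectable
            // as soon as the caller drops its own reference.
            return wrapperFactory.apply(address);
        }
        // With caching on, the map holds a strong reference to every wrapper
        // ever requested, which is the memory overhead discussed above.
        return cache.computeIfAbsent(address, wrapperFactory::apply);
    }

    /** Number of wrappers the cache itself is keeping alive. */
    int retainedCount() {
        return cache.size();
    }

    public static void main(String[] args) {
        ConfigurableWrapperCache<Object> cached =
                new ConfigurableWrapperCache<>(true, addr -> new Object());
        ConfigurableWrapperCache<Object> uncached =
                new ConfigurableWrapperCache<>(false, addr -> new Object());

        // Same instance is returned twice when caching is on.
        System.out.println(cached.get(42) == cached.get(42));
        // A fresh instance is created on every call when caching is off.
        System.out.println(uncached.get(42) == uncached.get(42));
        // Wrappers retained by each cache after the calls above.
        System.out.println(cached.retainedCount() + " vs " + uncached.retainedCount());
    }
}
```

With the switch off, callers pay for a new wrapper on every access but the
cache pins nothing in memory; with it on, repeated accesses are cheaper at the
cost of keeping every wrapper reachable. Which side wins depends on the access
pattern, which is why a switch rather than outright removal seems appropriate.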
--
This message is automatically generated by JIRA.