[ https://issues.apache.org/jira/browse/UIMA-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970082#comment-13970082 ]
Marshall Schor commented on UIMA-3747:
--------------------------------------
The typeSystemMappers map uses instances of target type systems as its keys.
In the more routine use cases I had imagined, there might be a few different
target type systems, but most likely just one. It sounds, though, as if your
example has a very large number of them, possibly unbounded. Do you think
this is the case?
One possible issue: the target type system uses object identity for equality
testing. So if your use case somehow creates the same type system over and
over again (identical in content, but different objects), each one would
appear as a distinct type system. If that is the case, and you see a way to
reuse the same target type system object, that would work around this issue.
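To illustrate the identity point, here is a minimal sketch in plain Java. It
is not the UIMA code: FakeTypeSystem, IdentityKeyDemo and the "mapper" string
values are hypothetical stand-ins, and an ordinary HashMap plays the role of
typeSystemMappers. It only shows how a key class that relies on object
identity makes a map grow by one entry per newly created key, while reusing
one key object keeps it at a single entry.

```java
import java.util.HashMap;
import java.util.Map;

public class IdentityKeyDemo {
    // Hypothetical stand-in for a target type system. It does not override
    // equals()/hashCode(), so map lookups fall back to object identity.
    static final class FakeTypeSystem {
        final String description;
        FakeTypeSystem(String description) { this.description = description; }
    }

    // Uses a freshly created (but content-identical) key on every round.
    static int entriesWithFreshKeys(int rounds) {
        Map<FakeTypeSystem, String> mappers = new HashMap<>();
        for (int i = 0; i < rounds; i++) {
            mappers.put(new FakeTypeSystem("same types"), "mapper");
        }
        return mappers.size();
    }

    // Reuses one key object on every round.
    static int entriesWithReusedKey(int rounds) {
        Map<FakeTypeSystem, String> mappers = new HashMap<>();
        FakeTypeSystem shared = new FakeTypeSystem("same types");
        for (int i = 0; i < rounds; i++) {
            mappers.put(shared, "mapper");
        }
        return mappers.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesWithFreshKeys(1000)); // prints 1000
        System.out.println(entriesWithReusedKey(1000)); // prints 1
    }
}
```

In this sketch, holding on to one target type system object and passing it in
each time is what keeps the map from growing.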
I think, in general, a fix for this would be to redesign typeSystemMappers
as some kind of memory-limited cache, with a way to expire target type
systems from it. I'll look at doing something like that...
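As a rough idea of what such a cache could look like, here is a sketch using
java.util.LinkedHashMap in access order with removeEldestEntry. This is an
assumption for illustration, not the design that will go into UIMA: the class
name BoundedMapperCache and the maxEntries parameter are hypothetical, and it
bounds the number of entries rather than the bytes used. Keys that do not
override equals()/hashCode() are still compared by identity here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: evicts the least recently used entry once
// more than maxEntries entries are present.
public class BoundedMapperCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    public BoundedMapperCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least recently
        // accessed to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put() after the insertion; returning true drops eldest.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedMapperCache<Object, String> cache = new BoundedMapperCache<>(2);
        Object a = new Object(), b = new Object(), c = new Object();
        cache.put(a, "mapper A");
        cache.put(b, "mapper B");
        cache.get(a);             // a is now the most recently used
        cache.put(c, "mapper C"); // evicts b, the least recently used
        System.out.println(cache.size());         // prints 2
        System.out.println(cache.containsKey(b)); // prints false
    }
}
```

An alternative would be java.util.WeakHashMap, which lets an entry be
collected once its key is no longer strongly referenced elsewhere; which of
the two fits better depends on who else holds on to the target type systems.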
> Memory management problem with compressed binary deserialization
> ----------------------------------------------------------------
>
> Key: UIMA-3747
> URL: https://issues.apache.org/jira/browse/UIMA-3747
> Project: UIMA
> Issue Type: Bug
> Components: Core Java Framework
> Affects Versions: 2.4.2SDK
> Reporter: Richard Eckart de Castilho
> Assignee: Marshall Schor
>
> We think we stumbled across a memory management problem with the new
> compressed binary serialization when a CAS is reset/reused in a loop, e.g. in
> the uimaFIT SimplePipeline. When we use form 6, we consistently run into
> out-of-memory situations. Finally, we took the time to do a heap dump
> analysis.
> We found a huge TypeSystemImpl instance in the heap (~450MB). What makes it
> huge is the field "typeSystemMappers"
> that in our case contains 1000+ entries, each of them apparently using
> a TypeSystemImpl as key.
> It looks like typeSystemMappers is never reset when a CAS is reused. My
> current theory is that it should be reset when CAS.reset() is called;
> otherwise type systems accumulate there when binary deserialization is
> used to repeatedly load data into a CAS in a loop that resets and reuses
> the CAS.