[ https://issues.apache.org/jira/browse/UIMA-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971394#comment-13971394 ]
Richard Eckart de Castilho edited comment on UIMA-3747 at 4/16/14 1:23 PM:
---------------------------------------------------------------------------
Not that I am aware of.
You can reproduce the problem with the following simple test. Before running
the test, remove the "private" modifier from TypeSystemImpl.typeSystemMappers.
{noformat}
public void testCasReuseWithDifferentTypeSystems() throws Exception
{
    // Create a CAS
    CAS cas = CasCreationUtils.createCas((TypeSystemDescription) null, null, null);
    cas.setDocumentLanguage("latin");
    cas.setDocumentText("test");

    // Serialize it
    ByteArrayOutputStream baos = new ByteArrayOutputStream(1024);
    Serialization.serializeWithCompression(cas, baos, cas.getTypeSystem());

    // Create a new CAS
    long min = Long.MAX_VALUE;
    long max = 0;
    CAS cas2 = CasCreationUtils.createCas((TypeSystemDescription) null, null, null);
    for (int i = 0; i < 100000; i++) {
        // Simulate us reinitializing the CAS with a new type system.
        TypeSystemImpl tgt = new TypeSystemImpl();
        for (int t = 0; t < 1000; t++) {
            tgt.addType("random" + t, tgt.getTopType());
        }
        tgt.commit();

        // Deserialize into the new type system
        ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
        Serialization.deserializeCAS(cas2, bais, tgt, null);

        long cur = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        max = Math.max(cur, max);
        min = Math.min(cur, min);
        if (i % 100 == 0) {
            System.out.printf("Cached: %d Max: %d Room left: %d %n",
                    ((TypeSystemImpl) cas2.getTypeSystem()).typeSystemMappers.size(),
                    max,
                    Runtime.getRuntime().maxMemory() - max);
        }
    }
}
{noformat}
Eventually, the output screeches to a halt:
{noformat}
...
Cached: 2301 Max: 1466865472 Room left: 442067136
Cached: 2401 Max: 1529083736 Room left: 379848872
Cached: 2501 Max: 1583309160 Room left: 325623448
Cached: 2601 Max: 1618738616 Room left: 290193992
Cached: 2701 Max: 1661499672 Room left: 247432936
Cached: 2801 Max: 1717535904 Room left: 191396704
Cached: 2901 Max: 1717535904 Room left: 191396704
<hanging>
{noformat}
> Memory management problem with compressed binary deserialization
> ----------------------------------------------------------------
>
> Key: UIMA-3747
> URL: https://issues.apache.org/jira/browse/UIMA-3747
> Project: UIMA
> Issue Type: Bug
> Components: Core Java Framework
> Affects Versions: 2.4.2SDK
> Reporter: Richard Eckart de Castilho
> Assignee: Marshall Schor
> Fix For: 2.6.0SDK
>
>
> We think we stumbled across a memory management problem with the new
> compressed binary serialization when a CAS is reset/reused in a loop, e.g. in
> the uimaFIT SimplePipeline. When we use form 6, we consistently run into
> out-of-memory situations. Finally, we took the time to do a heap dump
> analysis.
> We found a huge TypeSystemImpl instance in the heap (~450MB). What makes it
> huge is the field "typeSystemMappers", which in our case contains 1000+
> entries, each of them apparently using a TypeSystemImpl as key.
> It looks like typeSystemMappers is never reset when a CAS is reused. My
> current theory is that it should be reset when CAS.reset() is called;
> otherwise, type systems accumulate there when the binary deserialization is
> used to repeatedly load data into a CAS in a loop that is resetting and
> reusing the CAS.
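
The accumulation described above can be illustrated without UIMA. The following is only a minimal sketch under stated assumptions: FakeTypeSystem, FakeCas, deserializeInto, reset(boolean) and run are hypothetical stand-ins invented for illustration and are not the UIMA classes or their API. It assumes the cache is a plain map keyed by type system instances with identity-based equality, so every freshly created target type system adds a new entry unless the map is cleared on reset.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in classes for illustration only; NOT the UIMA implementation.
public class MapperCacheSketch {

    /** Stand-in for a target type system; uses Object's identity-based equals/hashCode. */
    static final class FakeTypeSystem {
        final String[] typeNames;

        FakeTypeSystem(int types) {
            typeNames = new String[types];
            for (int t = 0; t < types; t++) {
                typeNames[t] = "random" + t;
            }
        }
    }

    /** Stand-in for a reusable CAS that caches one mapper per target type system. */
    static final class FakeCas {
        // Assumption: a strongly referenced map keyed by type system instances.
        private final Map<FakeTypeSystem, int[]> mappers = new HashMap<>();

        /** Simulates deserialization: creates and caches a mapper for an unseen type system. */
        void deserializeInto(FakeTypeSystem tgt) {
            mappers.computeIfAbsent(tgt, ts -> new int[ts.typeNames.length]);
        }

        /** Simulates a reset; optionally drops the cached mappers as well. */
        void reset(boolean clearMappers) {
            if (clearMappers) {
                mappers.clear();
            }
        }

        int cachedMappers() {
            return mappers.size();
        }
    }

    /** Reuses one FakeCas for the given number of iterations and returns the final cache size. */
    public static int run(int iterations, boolean clearOnReset) {
        FakeCas cas = new FakeCas();
        for (int i = 0; i < iterations; i++) {
            cas.reset(clearOnReset);
            // A new type system instance per iteration, as in the reproduction test.
            cas.deserializeInto(new FakeTypeSystem(10));
        }
        return cas.cachedMappers();
    }

    public static void main(String[] args) {
        System.out.println("without clearing: " + run(500, false)); // prints 500
        System.out.println("with clearing:    " + run(500, true));  // prints 1
    }
}
```

In this sketch, clearing the map on reset keeps the cache at a single entry, while skipping the clear leaves one entry per iteration reachable for as long as the CAS is alive. Whether clearing on reset is the right fix for UIMA itself is a separate question that the sketch does not answer.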
--
This message was sent by Atlassian JIRA
(v6.2#6252)