[ 
https://issues.apache.org/jira/browse/UIMA-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971394#comment-13971394
 ] 

Richard Eckart de Castilho edited comment on UIMA-3747 at 4/16/14 1:23 PM:
---------------------------------------------------------------------------

Not that I am aware of. 

You can reproduce the problem with the following simple test. Before running 
the test, remove the "private" modifier from TypeSystemImpl.typeSystemMappers.

{noformat}
  // Imports needed by the test:
  // import java.io.ByteArrayInputStream;
  // import java.io.ByteArrayOutputStream;
  // import org.apache.uima.cas.CAS;
  // import org.apache.uima.cas.impl.Serialization;
  // import org.apache.uima.cas.impl.TypeSystemImpl;
  // import org.apache.uima.resource.metadata.TypeSystemDescription;
  // import org.apache.uima.util.CasCreationUtils;

  public void testCasReuseWithDifferentTypeSystems() throws Exception
  {
      // Create a CAS
      CAS cas = CasCreationUtils.createCas((TypeSystemDescription) null, null, null);
      cas.setDocumentLanguage("latin");
      cas.setDocumentText("test");

      // Serialize it
      ByteArrayOutputStream baos = new ByteArrayOutputStream(1024);
      Serialization.serializeWithCompression(cas, baos, cas.getTypeSystem());

      // Create a new CAS that is reused for every deserialization below
      long min = Long.MAX_VALUE;
      long max = 0;
      CAS cas2 = CasCreationUtils.createCas((TypeSystemDescription) null, null, null);
      for (int i = 0; i < 100000; i++) {
          // Simulate us reinitializing the CAS with a new type system.
          TypeSystemImpl tgt = new TypeSystemImpl();
          for (int t = 0; t < 1000; t++) {
              tgt.addType("random" + t, tgt.getTopType());
          }
          tgt.commit();

          // Deserialize into the new type system
          ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
          Serialization.deserializeCAS(cas2, bais, tgt, null);

          // Track used heap; report the mapper cache size every 100 iterations
          long cur = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
          max = Math.max(cur, max);
          min = Math.min(cur, min);
          if (i % 100 == 0) {
              System.out.printf("Cached: %d   Max: %d   Room left: %d   %n",
                      ((TypeSystemImpl) cas2.getTypeSystem()).typeSystemMappers.size(), max,
                      Runtime.getRuntime().maxMemory() - max);
          }
      }
  }
{noformat}

Eventually, the output screeches to a halt:

{noformat}
...
Cached: 2301   Max: 1466865472   Room left: 442067136   
Cached: 2401   Max: 1529083736   Room left: 379848872   
Cached: 2501   Max: 1583309160   Room left: 325623448   
Cached: 2601   Max: 1618738616   Room left: 290193992   
Cached: 2701   Max: 1661499672   Room left: 247432936   
Cached: 2801   Max: 1717535904   Room left: 191396704   
Cached: 2901   Max: 1717535904   Room left: 191396704
<hanging>
{noformat}



> Memory management problem with compressed binary deserialization
> ----------------------------------------------------------------
>
>                 Key: UIMA-3747
>                 URL: https://issues.apache.org/jira/browse/UIMA-3747
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>    Affects Versions: 2.4.2SDK
>            Reporter: Richard Eckart de Castilho
>            Assignee: Marshall Schor
>             Fix For: 2.6.0SDK
>
>
> We think we stumbled across a memory management problem with the new 
> compressed binary serialization when a CAS is reset/reused in a loop, e.g. in 
> the uimaFIT SimplePipeline. When we use form 6, we consistently run into 
> out-of-memory situations. Finally, we took the time to do a heap dump 
> analysis.
> We found a huge TypeSystemImpl instance in the heap (~450MB). What makes it 
> huge is the field "typeSystemMappers"
> that in our case contains 1000+ entries, each of them apparently using 
> a TypeSystemImpl as key.
> It looks like typeSystemMappers is never reset when a CAS is reused. My 
> current theory is that it should be reset when CAS.reset() is called, 
> otherwise type systems accumulate there when the binary deserialization is 
> used to repeatedly load data into a CAS in a loop that is resetting and 
> reusing the CAS.
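
To illustrate the accumulation pattern described above, here is a minimal, self-contained sketch. It does not use UIMA at all: FakeTypeSystem, FakeMapper and both methods are hypothetical stand-ins invented for this example, and the "clear on reset" variant only mirrors the theory stated in the description. It is not the actual UIMA implementation or fix.

```java
import java.util.HashMap;
import java.util.Map;

public class MapperCacheSketch {
    /** Hypothetical stand-in for a target type system used as a cache key. */
    static final class FakeTypeSystem {
        final int id;
        FakeTypeSystem(int id) { this.id = id; }
    }

    /** Hypothetical stand-in for the mapping data cached per target type system. */
    static final class FakeMapper {
        final int[] table = new int[1000];
    }

    /** Mimics the reported behavior: one entry per new type system, never cleared. */
    static int fillWithoutReset(int rounds) {
        Map<FakeTypeSystem, FakeMapper> cache = new HashMap<>();
        for (int i = 0; i < rounds; i++) {
            // Each round uses a fresh key object, so nothing is ever overwritten.
            cache.put(new FakeTypeSystem(i), new FakeMapper());
        }
        return cache.size();
    }

    /** Assumed remedy: clear the cache whenever the (simulated) CAS is reset. */
    static int fillWithReset(int rounds) {
        Map<FakeTypeSystem, FakeMapper> cache = new HashMap<>();
        for (int i = 0; i < rounds; i++) {
            cache.clear(); // what a reset hook might do
            cache.put(new FakeTypeSystem(i), new FakeMapper());
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("without reset: " + fillWithoutReset(500));
        System.out.println("with reset:    " + fillWithReset(500));
    }
}
```

Without the reset, the map grows by one entry per round, and every key and value stays strongly reachable. Clearing on reset keeps it at a single entry. Holding the keys weakly (for example with java.util.WeakHashMap) would be another option, but whether either approach suits TypeSystemImpl is an open question here.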



--
This message was sent by Atlassian JIRA
(v6.2#6252)
