[ 
https://issues.apache.org/jira/browse/UIMA-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15887757#comment-15887757
 ] 

Dennis Bauer commented on UIMA-5306:
------------------------------------

I ran the performance tests again, this time including the two remaining 
classes that had thrown an exception.

Here are some results for the large text that we actually use for performance 
testing. The memory values vary a bit because we run Ruta as a service inside 
our own software landscape, which accounts for around 50 MB that can be 
subtracted. The values are still fine for comparing the versions against each 
other.


As a reference, a test with the old RutaBasic class took 1,44 min for a 
response over our REST API. Memory peaked at 830 MB and averaged around 
700 MB.


Your first alternative version of this class brought no improvement. Its 
response time was 1,45 min, with a peak at 850 MB and an average of 710 MB. 
Allowing for measurement error, that is the same as the old RutaBasic class.

The second alternative took 1,46 min to respond, with a maximum peak at 
800 MB and an average of 500 MB. This looks like a suitable solution, provided 
the peaks do not break the runtime and scale with the maximum heap size.

The third one took 1,29 min for a response, so it was clearly the fastest. In 
return, memory consumption rises to 930 MB at the maximum peak, with an 
average of 500 MB as well. As with the second alternative, the average 
consumption is good, and I would prefer this one because it is faster. 
However, I am worried about the maximum peak and how it behaves if, for 
example, we reduce the heap size from 1 GB to 700 MB.

The fourth one took 1,42 min, with a peak at 910 MB and an average of 700 MB, 
so it is no improvement over the RutaBasic class at all.


I will test the concern I mentioned for the third version, i.e. whether the 
big test also runs through with a smaller heap size.
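
For a run with a reduced heap, the peak could also be read from inside the JVM 
itself. The following is only a minimal sketch using the standard 
java.lang.management API; the class name HeapPeakReport is made up for 
illustration and is not part of Ruta. Note that it sums the per-pool peaks, 
which can overestimate the true peak because the pools do not necessarily 
peak at the same moment.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of Ruta: reports heap limit and peak usage.
final class HeapPeakReport {

    // Sum of the peak usage of all heap memory pools, in bytes.
    // This is an upper bound: pools may peak at different times.
    static long peakHeapBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage peak = pool.getPeakUsage();
            if (peak != null) { // null if the pool is no longer valid
                total += peak.getUsed();
            }
        }
        return total;
    }

    // Maximum heap the JVM will attempt to use (reflects -Xmx), in bytes.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap:  " + maxHeapBytes() / mb + " MB");
        System.out.println("peak heap: " + peakHeapBytes() / mb + " MB");
    }
}
```

Calling peakHeapBytes() at the end of a request, with the JVM started with 
e.g. -Xmx700m, would give a value that can be compared against the limit 
reported by maxHeapBytes().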

> Memory Improvement - Unnecessary leaks
> --------------------------------------
>
>                 Key: UIMA-5306
>                 URL: https://issues.apache.org/jira/browse/UIMA-5306
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Ruta
>    Affects Versions: 2.3.0ruta
>         Environment: Windows 10, JVM with -Xmx 1024, Java JDK 1.8., 16gb 
> memory
>            Reporter: Dennis Bauer
>            Assignee: Peter Klügl
>
> In a production setup we found that Ruta itself uses a lot of memory. With 
> JVisualVM it is easy to see that a relatively small number of arrays of 
> ArrayLists has a high memory consumption (250k instances reserve 
> 243 000 000 bytes of memory).
> The problem is that in a clustered SaaS environment with less memory, these 
> arrays block relevant space in memory. A closer look at these arrays of 
> ArrayLists points to the class org.apache.uima.ruta.type.RutaBasic.
> A look at this class shows three arrays that are instantiated with the 
> maximum possible value that can be returned by the type system of the CAS. 
> {code:Java}
>   private int[] partOf =
>       new int[((TypeSystemImpl) getCAS().getTypeSystem()).getLargestTypeCode()];
>   private Collection<?>[] beginMap =
>       new ArrayList<?>[((TypeSystemImpl) getCAS().getTypeSystem()).getLargestTypeCode()];
>   private Collection<?>[] endMap =
>       new ArrayList<?>[((TypeSystemImpl) getCAS().getTypeSystem()).getLargestTypeCode()];
> {code}
> As an improvement, the memory for these arrays should be allocated 
> dynamically, so that the total memory consumption is reduced.
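
To illustrate what dynamic allocation could look like in principle, here is a 
minimal sketch that replaces an array sized to the largest type code with a 
map that is only created on the first write. This is an assumption for 
illustration only: LazyTypeSlots and its methods are hypothetical names, and 
the code is neither the actual RutaBasic implementation nor one of the 
alternatives measured above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Ruta code: per-type-code collections that are
// allocated on demand instead of one array slot per possible type code.
final class LazyTypeSlots<T> {

    // Stays null until the first value is added, so empty instances
    // cost only one reference field.
    private Map<Integer, Collection<T>> slots;

    // Adds a value under the given type code, allocating on demand.
    void add(int typeCode, T value) {
        if (slots == null) {
            slots = new HashMap<>(4);
        }
        slots.computeIfAbsent(typeCode, k -> new ArrayList<>(2)).add(value);
    }

    // Returns the values for a type code, or an empty list if none exist.
    Collection<T> get(int typeCode) {
        if (slots == null) {
            return Collections.<T>emptyList();
        }
        Collection<T> values = slots.get(typeCode);
        return values == null ? Collections.<T>emptyList() : values;
    }

    // Number of type codes that actually hold values.
    int allocatedSlots() {
        return slots == null ? 0 : slots.size();
    }
}
```

The trade-off is the usual one: a map lookup with boxing is slower than an 
array index, so a variant like this saves memory at the cost of some response 
time.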



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
