[
https://issues.apache.org/jira/browse/UIMA-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15887805#comment-15887805
]
Dennis Bauer commented on UIMA-5306:
------------------------------------
I'm wondering about the native implementation; I think it's done in C++. If
you're trying to bring C++ and Java together, I saw in the code that you always
need an ID system to point from Java to the data in C++ memory. Maybe these
addresses consume a lot of memory.
I have worked with Drools a few times, a rule engine / language for mapping
business knowledge to business rules. It has algorithms like PHREAK
(take a look:
http://blog.athico.com/2013/11/rip-rete-time-to-get-phreaky.html), which are
meant to keep the workflow steady and reduce memory consumption. Maybe it would
be interesting if Ruta could be optimized in a similar way: the rules would be
ordered so that every expected annotation is still created, but you could also
set a breakpoint in the Ruta execution at which all annotations created up to
that point are removed if, according to an up-front optimization pass, no
following rule uses them. That way memory consumption stays predictable and
memory that is no longer needed can be released.
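To make the breakpoint idea a bit more concrete, here is a minimal sketch of
the up-front pass I have in mind. It is only an illustration: the class and
method names are made up, rules are reduced to the set of annotation type names
they read, and nothing here is taken from the Ruta or Drools code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these names exist in Ruta or Drools. */
class LastUseSketch {

    /** For each annotation type, the index of the last rule that reads it. */
    static Map<String, Integer> lastUse(List<Set<String>> typesReadByRule) {
        Map<String, Integer> last = new HashMap<>();
        for (int i = 0; i < typesReadByRule.size(); i++) {
            for (String type : typesReadByRule.get(i)) {
                last.put(type, i); // a later rule overwrites an earlier index
            }
        }
        return last;
    }

    /**
     * Types whose annotations could be removed at a breakpoint placed right
     * after rule ruleIndex. Types in keep (e.g. wanted in the final result)
     * are never released.
     */
    static List<String> releasableAfter(int ruleIndex,
                                        Map<String, Integer> lastUse,
                                        Set<String> keep) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : lastUse.entrySet()) {
            if (e.getValue() == ruleIndex && !keep.contains(e.getKey())) {
                out.add(e.getKey());
            }
        }
        Collections.sort(out); // deterministic order for readability
        return out;
    }
}
```

Whether the real rule language allows such a static "reads" analysis (for
example with nested blocks or dynamically resolved types) is an open question
for me; the sketch assumes it does.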
I'd like to participate in Ruta, but the project is so gigantic that I don't
have the time to get into it and all of its components. The RutaParser class
with its 22000 LOC is just overwhelming.
> Memory Improvement - Unnecessary leaks
> --------------------------------------
>
> Key: UIMA-5306
> URL: https://issues.apache.org/jira/browse/UIMA-5306
> Project: UIMA
> Issue Type: Improvement
> Components: Ruta
> Affects Versions: 2.3.0ruta
> Environment: Windows 10, JVM with -Xmx 1024, Java JDK 1.8, 16 GB memory
> Reporter: Dennis Bauer
> Assignee: Peter Klügl
>
> In a production setup we found that Ruta itself uses a huge amount of memory.
> With JVisualVM it's easy to see that a relatively small number of arrays of
> ArrayLists accounts for high memory consumption (250k instances result in
> 243 000 000 bytes of reserved memory).
> The problem is that in a clustered SaaS environment with less memory, these
> arrays block relevant space in memory. A closer look at these arrays of
> ArrayLists points to the class org.apache.uima.ruta.type.RutaBasic.
> A look at this class shows three arrays that are instantiated with the maximum
> possible value that can be returned by the type system of the CAS.
> {code:Java}
> private int[] partOf =
>     new int[((TypeSystemImpl) getCAS().getTypeSystem()).getLargestTypeCode()];
> private Collection<?>[] beginMap =
>     new ArrayList<?>[((TypeSystemImpl) getCAS().getTypeSystem()).getLargestTypeCode()];
> private Collection<?>[] endMap =
>     new ArrayList<?>[((TypeSystemImpl) getCAS().getTypeSystem()).getLargestTypeCode()];
>
> {code}
> As an improvement, these arrays should be allocated dynamically, so that the
> total memory consumption is reduced.
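
One possible shape for such a dynamic allocation, as a rough sketch only:
instead of an array sized by getLargestTypeCode(), keep a map that is created
on first use and only holds the type codes that were actually added. The class
LazyTypeIndex and its methods are hypothetical and are not the actual
RutaBasic code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for beginMap/endMap; not the actual RutaBasic code. */
class LazyTypeIndex<T> {

    // Stays null until the first add, so an unused index holds no table at all.
    private Map<Integer, Collection<T>> byTypeCode;

    /** Registers an annotation under the given type code. */
    void add(int typeCode, T annotation) {
        if (byTypeCode == null) {
            byTypeCode = new HashMap<>();
        }
        byTypeCode.computeIfAbsent(typeCode, k -> new ArrayList<>()).add(annotation);
    }

    /** Returns the annotations for a type code, or an empty list if none. */
    Collection<T> get(int typeCode) {
        if (byTypeCode == null) {
            return Collections.<T>emptyList();
        }
        Collection<T> c = byTypeCode.get(typeCode);
        return c == null ? Collections.<T>emptyList() : c;
    }

    /** Number of type codes that currently have a collection allocated. */
    int allocatedSlots() {
        return byTypeCode == null ? 0 : byTypeCode.size();
    }
}
```

A HashMap has its own per-entry overhead (boxed keys, entry objects), so this
only pays off if most instances use few of the possible type codes. That would
have to be measured against the current arrays before changing anything.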