[ 
https://issues.apache.org/jira/browse/JENA-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372406#comment-14372406
 ] 

Stian Soiland-Reyes commented on JENA-901:
------------------------------------------

I also found that activeInterpreters leaks if you don't iterate through to 
the end; calling .close() on the iterator is not enough. Clean-up seems to 
happen only when it.hasNext() is called and returns false, so the leak can 
also occur when, for example, you return after getting the first hit.

{code}
@Test
        public void testNotLeakingActiveInterpreters() throws Exception {
                Graph data = Factory.createGraphMem();
                data.add(new Triple(a, ty, C1));
                data.add(new Triple(b, ty, C1));
                List<Rule> rules = Rule
                                .parseRules("[r1:  (?x p ?t) <- (?x rdf:type C1), makeInstance(?x, p, C2, ?t)]"
                                                + "[r2:  (?t rdf:type C2) <- (?x rdf:type C1), makeInstance(?x, p, C2, ?t)]");

                FBRuleInfGraph infgraph = (FBRuleInfGraph) createReasoner(rules).bind(data);

                LPBRuleEngine engine = getEngineForGraph(infgraph);
                assertEquals(0, engine.activeInterpreters.size());
                assertEquals(0, engine.tabledGoals.size());

                // we ask for a non-hit -- it works, but only because we call it.hasNext()
                ExtendedIterator<Triple> it = infgraph.find(nohit, ty, C1);
                assertFalse(it.hasNext());
                it.close();
                assertEquals(0, engine.activeInterpreters.size());

                // and again.
                // Ensure this is not cached by asking for a different triple pattern
                ExtendedIterator<Triple> it2 = infgraph.find(nohit, ty, C2);
                // oops, forgot to call it2.hasNext(), but .close() should tidy up
                it2.close();
                assertEquals(0, engine.activeInterpreters.size());

                
                // OK, let's ask for something that is in the graph
                
                ExtendedIterator<Triple> it3 = infgraph.find(a, ty, C1);
                assertTrue(it3.hasNext());
                assertEquals(a, it3.next().getMatchSubject());
                
                // .. and what if we forget to call next() to consume b?
                // (e.g. return from a method with the first hit)
                
                // this should be enough
                it3.close();
                // without leaks of activeInterpreters
                assertEquals(0, engine.activeInterpreters.size());
        }

{code}

I'll try to fix it in the same pull request.
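
For illustration, here is a minimal sketch of the behaviour the test above 
expects, using a toy model rather than Jena's actual classes. ToyEngine and 
ToyResultIterator are hypothetical names; the only idea taken from the report 
is that close() should deregister the iterator whether or not hasNext() has 
been called.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical stand-in for the engine: it only tracks which iterators are still open. */
class ToyEngine {
    final Set<ToyResultIterator> activeInterpreters = new HashSet<>();

    /** Registers a new iterator over the given (pre-computed) matches. */
    ToyResultIterator find(List<String> matches) {
        ToyResultIterator it = new ToyResultIterator(this, matches.iterator());
        activeInterpreters.add(it);
        return it;
    }
}

/** Hypothetical result iterator whose clean-up does not depend on exhaustion. */
class ToyResultIterator implements Iterator<String>, AutoCloseable {
    private final ToyEngine engine;
    private final Iterator<String> results;
    private boolean closed = false;

    ToyResultIterator(ToyEngine engine, Iterator<String> results) {
        this.engine = engine;
        this.results = results;
    }

    @Override
    public boolean hasNext() {
        if (closed) {
            return false;
        }
        if (results.hasNext()) {
            return true;
        }
        close(); // exhausted: release eagerly, as the existing code path does
        return false;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return results.next();
    }

    @Override
    public void close() {
        if (closed) {
            return; // idempotent: a second close() is a no-op
        }
        closed = true;
        engine.activeInterpreters.remove(this); // deregister even if never iterated
    }
}
```

In this sketch both paths (exhausting the iterator and closing it early) end 
in the same close() method, so the set of active iterators shrinks either way.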

> Make the cache of LPBRuleEngine bounded to avoid out-of-memory
> --------------------------------------------------------------
>
>                 Key: JENA-901
>                 URL: https://issues.apache.org/jira/browse/JENA-901
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: Reasoners
>    Affects Versions: Jena 2.12.1
>            Reporter: Jan De Beer
>
> The class "com.hp.hpl.jena.reasoner.rulesys.impl.LPBRuleEngine" uses an 
> in-memory cache named "tabledGoals", which has no limit as to the size/number 
> of entries stored.
> {noformat}
>     /** Table mapping tabled goals to generators for those goals.
>      *  This is here so that partial goal state can be shared across multiple queries. */
>     protected HashMap<TriplePattern, Generator> tabledGoals = new HashMap<>();
> {noformat}
> We have experienced out-of-memory issues because of the cache being filled 
> with millions of entries in just a few days under normal query usage 
> conditions and a heap memory set to 3GB.
> In our setup, we have a dataset containing multiple graphs. Some of them are 
> actual data graphs (backed by TDB), and two are ontology models using a 
> "TransitiveReasoner" and an "OWLMicroFBRuleReasoner", respectively. A typical 
> query may run over all the graphs in the dataset, including the ontology ones 
> (see below for a query template). Even though the ontology graphs would not 
> yield any additional results for data queries (which is fine), the 
> above-mentioned cache would still fill up with new entries.
> {noformat}
> SELECT ?p ?o
> WHERE {
>   GRAPH ?g {
>     <some resource of interest> ?p ?o .
>   }
> }
> {noformat}
> As there is no upper bound to the cache, sooner or later all available heap 
> memory will be consumed by the cache, giving rise to a critical 
> out-of-memory error.
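
One common way to bound such a cache, shown here only as a sketch, is an LRU 
map built on java.util.LinkedHashMap. BoundedCache and the limit passed to it 
are hypothetical and not part of Jena. Whether it is safe for the engine to 
evict a partially evaluated Generator is an assumption that would need 
checking.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical bounded cache: once more than maxEntries entries are present,
 * the least recently accessed entry is evicted.
 */
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently accessed
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put() after insertion; returning true drops the eldest entry
        return size() > maxEntries;
    }
}
```

A map like this could stand in for the plain HashMap in the snippet quoted 
above, at the cost of recomputing any goal whose entry has been evicted.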



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)