[ https://issues.apache.org/jira/browse/JENA-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372406#comment-14372406 ]
Stian Soiland-Reyes commented on JENA-901:
------------------------------------------
I also found that activeInterpreters leaks if you don't iterate through to
the end; calling .close() on the iterator is not enough. Clean-up seems to
happen only when it.hasNext() is called and returns false, so the leak can
also occur in cases where you return after getting the first hit.
{code}
@Test
public void testNotLeakingActiveInterpreters() throws Exception {
    Graph data = Factory.createGraphMem();
    data.add(new Triple(a, ty, C1));
    data.add(new Triple(b, ty, C1));
    List<Rule> rules = Rule.parseRules(
            "[r1: (?x p ?t) <- (?x rdf:type C1), makeInstance(?x, p, C2, ?t)]"
            + "[r2: (?t rdf:type C2) <- (?x rdf:type C1), makeInstance(?x, p, C2, ?t)]");
    FBRuleInfGraph infgraph = (FBRuleInfGraph) createReasoner(rules).bind(data);

    LPBRuleEngine engine = getEngineForGraph(infgraph);
    assertEquals(0, engine.activeInterpreters.size());
    assertEquals(0, engine.tabledGoals.size());

    // we ask for a non-hit -- it works, but only because we call it.hasNext()
    ExtendedIterator<Triple> it = infgraph.find(nohit, ty, C1);
    assertFalse(it.hasNext());
    it.close();
    assertEquals(0, engine.activeInterpreters.size());

    // and again.
    // Ensure this is not cached by asking for a different triple pattern
    ExtendedIterator<Triple> it2 = infgraph.find(nohit, ty, C2);
    // uuups, forgot to call it.hasNext(). But .close() should tidy
    it2.close();
    assertEquals(0, engine.activeInterpreters.size());

    // OK, let's ask for something that is in the graph
    ExtendedIterator<Triple> it3 = infgraph.find(a, ty, C1);
    assertTrue(it3.hasNext());
    assertEquals(a, it3.next().getMatchSubject());

    // .. and what if we forget to call next() to consume b?
    // (e.g. return from a method with the first hit)
    // this should be enough
    it3.close();
    // without leaks of activeInterpreters
    assertEquals(0, engine.activeInterpreters.size());
}
{code}
I'll try to fix it in the same pull request.
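To make the expected behaviour concrete, here is a minimal, self-contained sketch of the pattern the test asks for. It is not the Jena code: SketchEngine and TrackedIterator are hypothetical stand-ins, and the only point is that clean-up is idempotent and reachable both from exhaustion in hasNext() and from an early close().

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical stand-in for the rule engine; it only tracks active iterators. */
class SketchEngine {
    final Set<Object> activeInterpreters = new HashSet<>();

    /** Registers a new tracked iterator over the given results. */
    <T> TrackedIterator<T> find(Iterable<T> results) {
        TrackedIterator<T> it = new TrackedIterator<>(this, results.iterator());
        activeInterpreters.add(it);
        return it;
    }
}

/** Hypothetical iterator that deregisters itself on exhaustion or on close(). */
class TrackedIterator<T> implements Iterator<T>, AutoCloseable {
    private final SketchEngine engine;
    private final Iterator<T> source;
    private boolean closed = false;

    TrackedIterator(SketchEngine engine, Iterator<T> source) {
        this.engine = engine;
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        if (closed) {
            return false;
        }
        if (source.hasNext()) {
            return true;
        }
        close(); // exhaustion path: same clean-up as an explicit close()
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return source.next();
    }

    @Override
    public void close() {
        if (closed) {
            return; // idempotent: safe to call after exhaustion or twice
        }
        closed = true;
        engine.activeInterpreters.remove(this);
    }
}
```

With this shape, closing right after the first hit, or without ever calling hasNext(), leaves activeInterpreters empty, which is what the assertions in the test above check for.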
> Make the cache of LPBRuleEngine bounded to avoid out-of-memory
> --------------------------------------------------------------
>
> Key: JENA-901
> URL: https://issues.apache.org/jira/browse/JENA-901
> Project: Apache Jena
> Issue Type: Improvement
> Components: Reasoners
> Affects Versions: Jena 2.12.1
> Reporter: Jan De Beer
>
> The class "com.hp.hpl.jena.reasoner.rulesys.impl.LPBRuleEngine" uses an
> in-memory cache named "tabledGoals", which has no limit as to the size/number
> of entries stored.
> {noformat}
> /** Table mapping tabled goals to generators for those goals.
>  * This is here so that partial goal state can be shared across multiple queries. */
> protected HashMap<TriplePattern, Generator> tabledGoals = new HashMap<>();
> {noformat}
> We have experienced out-of-memory issues because the cache filled up with
> millions of entries in just a few days under normal query usage conditions,
> with the heap size set to 3 GB.
> In our setup, we have a dataset containing multiple graphs. Some of them are
> actual data graphs (backed by TDB), and two are ontology models using a
> "TransitiveReasoner" and an "OWLMicroFBRuleReasoner", respectively. A typical
> query may run over all the graphs in the dataset, including the ontology ones
> (see below for a query template). Even though the ontology graphs would not
> yield any additional results for data queries (which is fine), the
> above-mentioned cache would still fill up with new entries.
> {noformat}
> SELECT ?p ?o
> WHERE {
> GRAPH ?g {
> <some resource of interest> ?p ?o .
> }
> }
> {noformat}
> As there is no upper bound to the cache, sooner or later all available heap
> memory will be consumed by the cache, giving rise to a critical
> out-of-memory error.
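
For reference, one common way to put an upper bound on such a map in plain Java is a LinkedHashMap in access order with removeEldestEntry overridden. This is only a sketch of the general technique, not the fix that went into Jena: BoundedCache and maxEntries are hypothetical names, and a real change would also have to consider thread safety and any clean-up an evicted Generator needs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical size-bounded LRU map; evicts the least recently used entry. */
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true: get() and put() move an entry to the "young" end
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put() after insertion; returning true drops the eldest entry
        return size() > maxEntries;
    }
}
```

Note that LinkedHashMap is not synchronized, and in access order even get() modifies the map structurally, so concurrent use would need external locking or a wrapper such as Collections.synchronizedMap.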