[ https://issues.apache.org/jira/browse/JENA-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523142#comment-14523142 ]

ASF GitHub Bot commented on JENA-901:
-------------------------------------

Github user afs commented on a diff in the pull request:

    https://github.com/apache/jena/pull/47#discussion_r29499213
  
    --- Diff: jena-core/src/main/java/com/hp/hpl/jena/reasoner/rulesys/impl/LPBRuleEngine.java ---
    @@ -252,19 +272,31 @@ public synchronized void tablePredicate(Node predicate) {
         /**
          * Return a generator for the given goal (assumes that the caller knows that
          * the goal should be tabled).
    +     *
    +     * Note: If an earlier Generator for the same <code>goal</code> exists in the
    +     * cache, it will be returned without considering the provided <code>clauses</code>.
    +     *
          * @param goal the goal whose results are to be generated
          * @param clauses the precomputed set of code blocks used to implement the goal
          */
    -    public synchronized Generator generatorFor(TriplePattern goal, List<RuleClauseCode> clauses) {
    -        Generator generator = tabledGoals.get(goal);
    -        if (generator == null) {
    -            LPInterpreter interpreter = new LPInterpreter(this, goal, clauses, false);
    -            activeInterpreters.add(interpreter);
    -            generator = new Generator(interpreter, goal);
    -            schedule(generator);
    -            tabledGoals.put(goal, generator);
    -        }
    -        return generator;
    +    public synchronized Generator generatorFor(final TriplePattern goal, final List<RuleClauseCode> clauses) {
    +           return getCachedTabledGoal(goal, new Callable<Generator>() {
    +                   @Override
    +               public Generator call() {
    +                           /** FIXME: Unify with #generatorFor(TriplePattern) - but investigate what about
    +                            * the edge case that this method might have been called with the of goal == null
    +                            * or goal.size()==0 -- which gives different behaviour in
    +                            * LPInterpreter constructor than through the route of
    +                            * generatorFor(TriplePattern) which calls a different LPInterpreter constructor
    +                            * which would fill in from RuleStore.
    +                            */
    --- End diff --
    
    I don't know enough about the rules engine to comment.  We need another 
reviewer here.
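
For reviewers less familiar with the pattern in the diff above, here is a minimal sketch of a "return the cached value, otherwise build it with a Callable" helper. It is an assumption about what getCachedTabledGoal is meant to do, not the code from the pull request; the class GetOrCreateCache and the method getOrCreate are hypothetical names, and plain strings stand in for Generator objects.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of Jena: a synchronized get-or-create cache. */
class GetOrCreateCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();

    /**
     * Returns the cached value for key if one exists. Otherwise calls the
     * factory once, stores its result, and returns it. A factory passed on a
     * later call for the same key is ignored.
     */
    synchronized V getOrCreate(K key, Callable<V> factory) {
        V value = cache.get(key);
        if (value == null) {
            try {
                value = factory.call();
            } catch (Exception e) {
                // Callable.call() declares a checked exception; wrap it here.
                throw new IllegalStateException("factory failed for key " + key, e);
            }
            cache.put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        GetOrCreateCache<String, String> cache = new GetOrCreateCache<>();
        String first = cache.getOrCreate("goal", () -> "built from clauses A");
        String second = cache.getOrCreate("goal", () -> "built from clauses B");
        System.out.println(first);
        System.out.println(second);
    }
}
```

The second call returns the value built by the first factory, which mirrors the Javadoc note that an earlier Generator for the same goal is returned without considering the provided clauses.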


> Make the cache of LPBRuleEngine bounded to avoid out-of-memory
> --------------------------------------------------------------
>
>                 Key: JENA-901
>                 URL: https://issues.apache.org/jira/browse/JENA-901
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: Reasoners
>    Affects Versions: Jena 2.12.1
>            Reporter: Jan De Beer
>
> The class "com.hp.hpl.jena.reasoner.rulesys.impl.LPBRuleEngine" uses an 
> in-memory cache named "tabledGoals", which has no limit as to the size/number 
> of entries stored.
> {noformat}
>     /** Table mapping tabled goals to generators for those goals.
>      *  This is here so that partial goal state can be shared across multiple 
> queries. */
>     protected HashMap<TriplePattern, Generator> tabledGoals = new HashMap<>();
> {noformat}
> We have experienced out-of-memory issues because of the cache being filled 
> with millions of entries in just a few days under normal query usage 
> conditions and a heap memory set to 3GB.
> In our setup, we have a dataset containing multiple graphs, some of them are 
> actual data graphs (backed by TDB), and then there are two which are ontology 
> models using a "TransitiveReasoner" and an "OWLMicroFBRuleReasoner", 
> respectively. A typical query may run over all the graphs in the dataset, 
> including the ontology ones (see below for a query template). Even though the 
> ontology graphs would not yield any additional results for data queries 
> (which is fine), the above-mentioned cache would still fill up with new 
> entries.
> {noformat}
> SELECT ?p ?o
> WHERE {
>   GRAPH ?g {
>     <some resource of interest> ?p ?o .
>   }
> }
> {noformat}
> As there is no upper bound to the cache, sooner or later all available heap 
> memory will be consumed by the cache, giving rise to an out-of-memory 
> critical error.
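
One way to put an upper bound on a map such as tabledGoals is a least-recently-used (LRU) map with a fixed capacity. The sketch below uses only java.util.LinkedHashMap and is not the fix adopted for this issue; the class name BoundedLruCache and the capacity of 2 are hypothetical, and plain strings stand in for TriplePattern and Generator. Whether a Generator can be evicted safely while it is still in use is a separate question that this sketch does not address.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Jena: a size-bounded LRU map. */
class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedLruCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedLruCache<String, String> cache = new BoundedLruCache<>(2);
        cache.put("goal-a", "generator-a");
        cache.put("goal-b", "generator-b");
        cache.get("goal-a");                 // goal-a becomes most recently used
        cache.put("goal-c", "generator-c");  // evicts goal-b, the least recently used
        System.out.println(cache.keySet());
    }
}
```

LinkedHashMap is not thread-safe, so callers would still need external synchronization, as the synchronized methods of LPBRuleEngine already provide for tabledGoals.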



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
