I've spent some time on this but with limited results. At heart this looks like a long-standing and deep-seated bug in the backward chainer, but I'm not yet sure how to fix it safely.

I can reproduce the problem fine using the recipe in the ticket but haven't yet managed to create a standalone test case.

More annoyingly, if I pause in a debugger the test case works. I thought this was because of the switch of tabledGoals in LBRuleEngine to a weak hashmap (for JENA-901), but if I remove the weakValues() call the test remains non-deterministic.

That aside, the problem is, as you might expect, the tabling of goals.

The backward chainer can mark any or all predicates as tabled. Any goal (basically a triple pattern) which is tabled is satisfied by getting a Generator for it from the tabledGoals store. These goals might be from the top level query or from body terms in the chain of rules being fired. The Generator instance stores both the results for the goal known so far and an interpreter instance with associated state that can be used to generate more answers. Part of that interpreter state can be an actual triple query to the underlying data. That's what these TopLevelTripleMatchFrame instances are.

The Generators in the tabledGoals table are expected to outlast individual queries (kind of the point of them), but that means that if the query didn't run to completion then the Generators can hold on to state, including unclosed TopLevelTripleMatchFrame instances, and so on to open iterators in the store. Bad.
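
To make that concrete, here is a toy model of the tabling behaviour described above. It is only a sketch and not the Jena code: TablingSketch, StoreCursor, Generator and GoalTable are hypothetical stand-ins (StoreCursor plays the role of a TopLevelTripleMatchFrame holding a store iterator), and it assumes a single thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes are the real Jena ones.
final class TablingSketch {

    /** Stands in for an open iterator over the underlying store. */
    static final class StoreCursor {
        private final Iterator<String> rows;
        private boolean open = true;
        StoreCursor(List<String> rows) { this.rows = rows.iterator(); }
        boolean hasNext() { return open && rows.hasNext(); }
        String next() { return rows.next(); }
        void close() { open = false; }
        boolean isOpen() { return open; }
    }

    /** Holds the answers found so far plus the state needed to find more. */
    static final class Generator {
        private final List<String> results = new ArrayList<>();
        private final StoreCursor cursor;
        Generator(StoreCursor cursor) { this.cursor = cursor; }

        /** Returns the i-th answer, pulling from the store on demand, or null if there is none. */
        String get(int i) {
            while (results.size() <= i && cursor.hasNext()) {
                results.add(cursor.next());
            }
            // The cursor is only closed once the store has been exhausted.
            if (cursor.isOpen() && !cursor.hasNext()) {
                cursor.close();
            }
            return i < results.size() ? results.get(i) : null;
        }

        boolean holdsOpenCursor() { return cursor.isOpen(); }
    }

    /** Goal table that outlives individual queries. */
    static final class GoalTable {
        private final Map<String, Generator> tabledGoals = new HashMap<>();
        private final List<String> storeRows;
        GoalTable(List<String> storeRows) { this.storeRows = storeRows; }

        Generator generatorFor(String goal) {
            return tabledGoals.computeIfAbsent(goal, g -> new Generator(new StoreCursor(storeRows)));
        }
    }

    public static void main(String[] args) {
        GoalTable table = new GoalTable(Arrays.asList("alice", "bob", "carol"));
        String goal = "?s rdf:type Person";

        // "LIMIT 1" style query: one answer is taken, then the query stops.
        Generator first = table.generatorFor(goal);
        System.out.println(first.get(0));             // alice
        System.out.println(first.holdsOpenCursor());  // true: the cursor is left open

        // A later query gets the same generator, open cursor and all.
        Generator second = table.generatorFor(goal);
        System.out.println(second == first);          // true
        System.out.println(second.get(2));            // carol
        System.out.println(second.holdsOpenCursor()); // false: closed once exhausted
    }
}
```

In this model the second query only works because the cursor left over from the first is still usable; if the store had invalidated it in between, the second query would be reading from a stale iterator.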

If the top level query is closed early then the LPTopGoalIterator does try to clean up the engine state by looking for Generators that have completed, and does close the top level interpreter. However, it doesn't seem to do anything about the remaining in-flight Generators and I assume that's the underlying problem. There needs to be some sort of scan to close those down and remove them from the tabledGoals (so as not to poison future runs by having an incomplete set of results in the goal table). That all looks really tricky to do when there can be concurrent top level queries all pulling on the same set of tabled generators.
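
The kind of scan I have in mind could look roughly like the following. This is a sketch under assumptions, not a proposed patch: GoalTableCleanupSketch, Generator, GoalTable and abandonIncomplete are hypothetical names, the model is single-threaded, and it deliberately ignores the hard part (other top level queries that may still be consuming the same generators).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes are the real Jena ones.
final class GoalTableCleanupSketch {

    static final class Generator {
        final String goal;
        final boolean complete;      // true once every answer has been found
        private boolean cursorOpen;  // true while store iterators are still held

        Generator(String goal, boolean complete) {
            this.goal = goal;
            this.complete = complete;
            this.cursorOpen = !complete; // assumption: complete generators hold no cursor
        }
        void close() { cursorOpen = false; }
        boolean holdsOpenCursor() { return cursorOpen; }
    }

    static final class GoalTable {
        private final Map<String, Generator> tabledGoals = new LinkedHashMap<>();

        void table(Generator g) { tabledGoals.put(g.goal, g); }
        boolean isTabled(String goal) { return tabledGoals.containsKey(goal); }

        /**
         * Closes and evicts every incomplete generator so that a later query
         * starts it afresh instead of reusing partial results or stale cursors.
         * Returns the goals that were evicted. Not safe for concurrent use.
         */
        List<String> abandonIncomplete() {
            List<String> evicted = new ArrayList<>();
            Iterator<Map.Entry<String, Generator>> it = tabledGoals.entrySet().iterator();
            while (it.hasNext()) {
                Generator g = it.next().getValue();
                if (!g.complete) {
                    g.close();
                    it.remove();
                    evicted.add(g.goal);
                }
            }
            return evicted;
        }
    }

    public static void main(String[] args) {
        GoalTable table = new GoalTable();
        table.table(new Generator("?c rdfs:subClassOf Person", true));
        table.table(new Generator("?s rdf:type Person", false));

        // Called when the top level iterator is closed before running to completion.
        System.out.println(table.abandonIncomplete());                   // [?s rdf:type Person]
        System.out.println(table.isTabled("?c rdfs:subClassOf Person")); // true
        System.out.println(table.isTabled("?s rdf:type Person"));        // false
    }
}
```

Completed generators stay in the table, so the benefit of tabling is kept for anything that did finish; only the partial ones are thrown away.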

However, there's lots I don't understand about the current behaviour (amazing what you can forget about complex code in 10 years!).

- I'm surprised that LPInterpreter doesn't also close any associated TopLevelTripleMatchFrame. That on its own wouldn't (and doesn't) fix this problem because the relevant interpreter is not itself closed but I'm surprised we don't see lots of unclosed iterators lying around (or at least not that we've noticed).

- Given that some interpreters *are* closed (e.g. the top level one) I would have expected those to need to be removed from the tabledGoals if they weren't complete. I can't see any code to do that but if it's not being done I can't understand how the system works at all!

So no real help yet I'm afraid.

Dave


On 08/06/2019 12:34, Andy Seaborne wrote:
Even just some pointers as to where in the code it should close TopLevelTripleMatchFrame (if it should close it) would be helpful if getting it set up is a barrier. I don't know the rules code well enough.

-------------------

The example on JENA-1719 after editing the <file:> and tdb2:location runs.

Files in /home/afs/DIR/


Load data:
cd DIR
tdb2.tdbloader --loc=graph data.nt

I run it in Eclipse with

public static void main(String[] argv) throws Exception {
   FusekiMainCmd.main("--config=/home/afs/DIR/example.ttl");
}

My example.ttl below.

     Many thanks
     Andy

On 08/06/2019 12:22, Dave Reynolds wrote:
Hi Andy,

Sure, I'll try to take a look over the weekend if I can get a working dev environment set up.

Dave

On 07/06/2019 13:46, Andy Seaborne wrote:
Dave,

I am hoping you can give me some pointers so I can solve JENA-1719.
https://issues.apache.org/jira/browse/JENA-1719

Setup:
   data in TDB2.
   ontology in a memory model.
   RDFSExptRuleReasoner

First query has a short limit:
   select * where {?s a <http://example.com/ns/Person>} limit 1
This returns one answer.

The iterator (LPTopGoalIterator) over the inf graph does get closed.

Second query has a longer limit
   select * where {?s a <http://example.com/ns/Person>} limit 1000

(If the long query is used first, the second query works)

Problem:

The second query is using an iterator created during the first query.

During the LIMIT 1 query:

TopLevelTripleMatchFrame constructor
TopLevelTripleMatchFrame constructor
TopLevelTripleMatchFrame constructor
TopLevelTripleMatchFrame constructor
TopLevelTripleMatchFrame constructor
closeIterator[?s, rdf:type, http://example.com/ns/Person] <-- The query
LPInterpreter.close[cpFrame:ConsumerChoicePointFrame]
LPInterpreter.close[cpFrame:null]
LPInterpreter.close[cpFrame:ConsumerChoicePointFrame]
LPInterpreter.close[cpFrame:null]


TopLevelTripleMatchFrame.close is not called - and the same is true generally during startup.

     Thanks,
     Andy

@prefix :        <#> .
@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb2:    <http://jena.apache.org/2016/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .

<#service> rdf:type fuseki:Service ;
     rdfs:label                      "Fuseki service" ;
     fuseki:name                     "example" ;
     fuseki:serviceQuery             "query" ;
     fuseki:serviceQuery             "sparql" ;
     fuseki:serviceUpdate            "update" ;
     fuseki:serviceUpload            "upload" ;
     fuseki:serviceReadWriteGraphStore      "data" ;
     fuseki:serviceReadGraphStore    "get" ;
     fuseki:dataset           <#inf_dataset> ;
     .

<#inf_dataset> rdf:type ja:RDFDataset ;
     ja:defaultGraph <#model> .

<#model> rdf:type ja:InfModel ;
     ja:reasoner <#reasoner> ;
     ja:baseModel <#tdb_graph> .

<#reasoner> rdf:type ja:ReasonerFactory ;
     ja:reasonerURL <http://jena.hpl.hp.com/2003/RDFSExptRuleReasoner> ;
     ja:schema <#ontology> .

<#ontology> rdf:type ja:MemoryModel ;
     ja:content [ ja:externalContent <file:///home/afs/DIR/example.owl> ] .

<#tdb_graph> rdf:type tdb2:GraphTDB ;
     tdb2:dataset <#tdb2_dataset> .

<#tdb2_dataset> rdf:type tdb2:DatasetTDB2 ;
     tdb2:location "/home/afs/DIR/graph" .
