On 30/06/2019 10:42, Dave Reynolds wrote:
Having spent some more time on this I still can't find a safe fix to the
underlying problem but have created a brute-force workaround.
https://github.com/apache/jena/pull/580
As it says in the PR it might need some serious redesign of the backward
engine to properly fix the underlying issue; it really shouldn't be this
hard to clean up partial state despite all the tabling. That's not
something I can help with in the foreseeable future, hence going for a
workaround.
I suspect I've got the workflow wrong with this PR. I used my normal git
process of developing the changes in a branch and then issuing a PR
against that branch but maybe that's not the right thing to do with the
apache setup. Apologies if I've messed up. Should I have done this in my
own fork instead?
It's worked and I've merged it. A branch in the main repo has a similar
flow, including needing any local tidy-up afterwards.
If it is of any size, or rather of any length in time, a clone+branch is
probably the way to go, but for immediate fixes (esp. when your jena
working copy is already "in use"), direct branches seem Ok to me at least.
Andy
Dave
On 09/06/2019 22:56, Dave Reynolds wrote:
I've spent some time on this but with limited results. At heart this
looks like a long-standing and deep-seated bug in the backward chainer but
I'm not yet sure how to safely fix it.
I can reproduce the problem fine using the recipe in the ticket but
haven't yet managed to create a standalone test case.
More annoyingly if I pause in a debugger then the test case works. I
thought this was because of the switch of tabledGoals in LBRuleEngine
to a weak hashmap (for JENA-901) but if I remove the weakValues() call
the test remains non-deterministic.
That aside, the problem is, as you might expect, the tabling of goals.
The backward chainer can mark any or all predicates as tabled. Any
goal (basically a triple pattern) which is tabled is satisfied by
getting a Generator for it from the tabledGoals store. These goals
might be from the top level query or from body terms in the chain of
rules being fired. The Generator instance stores both the results for
the goal known so far and an interpreter instance with associated
state that can be used to generate more answers. Part of that
interpreter state can be an actual triple query to the underlying
data. That's what these TopLevelTripleMatchFrame instances are.
The Generators in the tabledGoals table are expected to outlast
individual queries (kind of the point of them) but that means that if
the query didn't run to completion then the Generators can hold on to
state, including unclosed TopLevelTripleMatchFrame instances, and so to
iterators in the store. Bad.
If the top level query is closed early then the LPTopGoalIterator does
try to clean up the engine state by looking for Generators that have
completed, and does close the top level interpreter. However, it
doesn't seem to do anything about the remaining inflight Generators
and I assume that's the underlying problem. There needs to be some
sort of a scan to close those down and remove them from the
tabledGoals (so as to not poison future runs by having an incomplete
set of results in the goal table). That all looks really tricky to do
when there can be concurrent top level queries all pulling on the same
set of tabled generators.
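To make the kind of scan described above concrete, here is a minimal sketch in plain Java. It is not Jena's code: TrackedIterator, SketchGenerator, SketchGoalTable and closeInFlight are hypothetical names invented for illustration, the goal is just a string key, and the sketch is single-threaded, so it deliberately ignores the concurrent top level query problem.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a store iterator that must be closed. */
class TrackedIterator implements Iterator<String>, AutoCloseable {
    static int openCount = 0;            // iterators opened but not yet closed
    private final Iterator<String> source;
    private boolean closed = false;

    TrackedIterator(List<String> data) {
        this.source = data.iterator();
        openCount++;
    }
    @Override public boolean hasNext() { return !closed && source.hasNext(); }
    @Override public String next() { return source.next(); }
    @Override public void close() {
        if (!closed) { closed = true; openCount--; }
    }
}

/** Hypothetical tabled-goal generator: results so far plus resumable state. */
class SketchGenerator {
    final List<String> results = new ArrayList<>();
    private final TrackedIterator state;
    private boolean complete = false;

    SketchGenerator(List<String> data) { this.state = new TrackedIterator(data); }

    /** Pull one more answer; returns false once the data is exhausted. */
    boolean pump() {
        if (complete) return false;
        if (state.hasNext()) {
            results.add(state.next());
            return true;
        }
        complete = true;
        state.close();                   // a completed generator releases its iterator
        return false;
    }
    boolean isComplete() { return complete; }
    void abandon() { state.close(); }    // release state without completing
}

/** Hypothetical goal table, keyed by a goal pattern string. */
class SketchGoalTable {
    final Map<String, SketchGenerator> tabledGoals = new HashMap<>();

    SketchGenerator generatorFor(String goal, List<String> data) {
        return tabledGoals.computeIfAbsent(goal, g -> new SketchGenerator(data));
    }

    /** Close and drop every generator that did not run to completion. */
    int closeInFlight() {
        int removed = 0;
        Iterator<SketchGenerator> it = tabledGoals.values().iterator();
        while (it.hasNext()) {
            SketchGenerator gen = it.next();
            if (!gen.isComplete()) {
                gen.abandon();
                it.remove();             // avoid serving partial results later
                removed++;
            }
        }
        return removed;
    }
}
```

In this sketch, pumping a generator once and then abandoning the query leaves one iterator open until closeInFlight() runs; afterwards the table no longer holds the partial entry, so a later query would rebuild it from scratch rather than reuse an incomplete result set.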
However, there's lots I don't understand about the current behaviour
(amazing what you can forget about complex code in 10 years!).
- I'm surprised that LPInterpreter doesn't also close any associated
TopLevelTripleMatchFrame. That on its own wouldn't (and doesn't) fix
this problem because the relevant interpreter is not itself closed but
I'm surprised we don't see lots of unclosed iterators lying around (or
at least not that we've noticed).
- Given that some interpreters *are* closed (e.g. the top level one) I
would have expected those to need to be removed from the tabledGoals
if they weren't complete. I can't see any code to do that but if it's
not being done I can't understand how the system works at all!
So no real help yet I'm afraid.
Dave
On 08/06/2019 12:34, Andy Seaborne wrote:
Even just some pointers as to where in the code it should close
TopLevelTripleMatchFrame (if it should close it) would be helpful if
getting it set up is a barrier. I don't know the rules code well
enough.
-------------------
The example on JENA-1719 after editing the <file:> and tdb2:location
runs.
Files in /home/afs/DIR/
Load data:
cd DIR
tdb2.tdbloader --loc=graph data.nt
I run it in Eclipse with
public static void main(String[] argv) throws Exception {
    FusekiMainCmd.main("--config=/home/afs/DIR/example.ttl");
}
My example.ttl below.
Many thanks
Andy
On 08/06/2019 12:22, Dave Reynolds wrote:
Hi Andy,
Sure, I'll try to take a look over the weekend if I can get a
working dev environment set up.
Dave
On 07/06/2019 13:46, Andy Seaborne wrote:
Dave,
I am hoping you can give me some pointers so I can solve JENA-1719.
https://issues.apache.org/jira/browse/JENA-1719
Setup:
data in TDB2.
ontology in a memory model.
RDFSExptRuleReasoner
First query has a short limit:
select * where {?s a <http://example.com/ns/Person>} limit 1
This returns one answer.
The iterator (LPTopGoalIterator) over the inf graph does get closed.
Second query has a longer limit
select * where {?s a <http://example.com/ns/Person>} limit 1000
(If the long query is used first, the second query works)
Problem:
The second query is using an iterator created during the first query.
During the LIMIT 1 query:
TopLevelTripleMatchFrame constructor
TopLevelTripleMatchFrame constructor
TopLevelTripleMatchFrame constructor
TopLevelTripleMatchFrame constructor
TopLevelTripleMatchFrame constructor
closeIterator[?s, rdf:type, http://example.com/ns/Person] <-- The query
LPInterpreter.close[cpFrame:ConsumerChoicePointFrame]
LPInterpreter.close[cpFrame:null]
LPInterpreter.close[cpFrame:ConsumerChoicePointFrame]
LPInterpreter.close[cpFrame:null]
TopLevelTripleMatchFrame.close is not called - and the same is true
generally during startup.
Thanks,
Andy
@prefix : <#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb2: <http://jena.apache.org/2016/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
<#service> rdf:type fuseki:Service ;
    rdfs:label "Fuseki service" ;
    fuseki:name "example" ;
    fuseki:serviceQuery "query" ;
    fuseki:serviceQuery "sparql" ;
    fuseki:serviceUpdate "update" ;
    fuseki:serviceUpload "upload" ;
    fuseki:serviceReadWriteGraphStore "data" ;
    fuseki:serviceReadGraphStore "get" ;
    fuseki:dataset <#inf_dataset> ;
    .
<#inf_dataset> rdf:type ja:RDFDataset ;
    ja:defaultGraph <#model> .
<#model> rdf:type ja:InfModel ;
    ja:reasoner <#reasoner> ;
    ja:baseModel <#tdb_graph> .
<#reasoner> rdf:type ja:ReasonerFactory ;
    ja:reasonerURL <http://jena.hpl.hp.com/2003/RDFSExptRuleReasoner> ;
    ja:schema <#ontology> .
<#ontology> rdf:type ja:MemoryModel ;
    ja:content [ ja:externalContent
        <file:///home/afs/DIR/example.owl> ] .
<#tdb_graph> rdf:type tdb2:GraphTDB ;
    tdb2:dataset <#tdb2_dataset> .
<#tdb2_dataset> rdf:type tdb2:DatasetTDB2 ;
    tdb2:location "/home/afs/DIR/graph" .