[ 
https://issues.apache.org/jira/browse/JENA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346614#comment-17346614
 ] 

Lorenz Bühmann commented on JENA-2107:
--------------------------------------

[~andy] fixed the commit message.

And sure, I can test it with our dataset and post the numbers here to give an 
idea of the performance gain. A proper benchmark generator plus a set of 
test queries would be better, but I'm sure somebody will build one for RDF-star 
in the near future anyway.

(By the way, I also tried a quick fix for TDB1/TDB2 locally. I guess it's 
basically the same, except that it calls the {{convToBinding}} method to get 
the binding from the ID.)

We were also wondering whether you're thinking of any index structures for the 
embedded triples. At least for the top level it would be possible, though, to 
be fair, even there we would have to do it for both the subject and the object 
position and then for each permutation, which sounds like overkill, I think. 
Especially as there isn't much demand at the moment. As usual, it's a tradeoff.

 

[~Aklakan] what would be the purpose of tracking the sizes? (ok, we can discuss 
internally in the office later)

> RDF Star performance issue with non-concrete node triples
> ---------------------------------------------------------
>
>                 Key: JENA-2107
>                 URL: https://issues.apache.org/jira/browse/JENA-2107
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ
>    Affects Versions: Jena 3.17.0, Jena 4.0.0
>            Reporter: Lorenz Bühmann
>            Priority: Critical
>             Fix For: Jena 4.1.0
>
>
> The following graph pattern is not evaluated efficiently (it results in a 
> full scan per binding) because the second triple pattern doesn't take 
> advantage of the bindings generated by evaluating the first one:
> {code:java}
> ?s <p> ?o .  
> << ?s <p> ?o >> <p2> ?v .
> {code}
> A possible fix would be to adapt the method {{rdfStarTripleSub()}} in class
> [SolverRX3.java|https://github.com/apache/jena/blob/2efff8a00b4ffa82751cf46c8a3fed84b6ff3090/jena-arq/src/main/java/org/apache/jena/sparql/engine/main/solver/SolverRX3.java#L63-L71]
> by changing the beginning to
> {code:java}
> private static Iterator<Binding> rdfStarTripleSub(Binding input, Triple xPattern, ExecutionContext execCxt) {
>         Triple tPattern = Substitute.substitute(xPattern, input);
> {code}
> With this change we went from 75s for a very small dataset (50k triples) to 
> near-instant response times.
> If this fix is correct and doesn't break anything, the same change might 
> also fix the quads counterpart in the {{SolverRX4}} class.
>  
> Note: for tdbquery, this seems to be evaluated in a different place. At 
> least, we couldn't see any performance improvement there; it's still horribly slow.
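
To illustrate why substituting the input binding into the pattern before matching can avoid a full scan per binding, here is a minimal, self-contained sketch. It is a toy model and not Jena code: the class {{SubstituteSketch}}, its methods, the string-based terms and the single subject index are all assumptions made up for illustration, and it uses flat triples only (no embedded triples). It only mirrors the idea behind the {{Substitute.substitute(xPattern, input)}} call proposed in the issue description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model, not Jena code: terms are strings, a leading '?' marks a variable,
// and a triple or triple pattern is a String[3] of {subject, predicate, object}.
final class SubstituteSketch {
    private final List<String[]> triples = new ArrayList<>();
    private final Map<String, List<String[]>> bySubject = new HashMap<>();
    int inspected = 0; // number of stored triples examined so far

    static boolean isVar(String term) {
        return term.startsWith("?");
    }

    void add(String s, String p, String o) {
        String[] t = {s, p, o};
        triples.add(t);
        bySubject.computeIfAbsent(s, k -> new ArrayList<>()).add(t);
    }

    // Replace each variable covered by the binding with its bound value.
    static String[] substitute(String[] pattern, Map<String, String> binding) {
        String[] out = new String[3];
        for (int i = 0; i < 3; i++) {
            out[i] = binding.getOrDefault(pattern[i], pattern[i]);
        }
        return out;
    }

    // Match a pattern: use the subject index when the subject is concrete,
    // otherwise fall back to scanning every stored triple.
    List<String[]> find(String[] pattern) {
        List<String[]> candidates;
        if (isVar(pattern[0])) {
            candidates = triples;
        } else {
            candidates = bySubject.get(pattern[0]);
            if (candidates == null) {
                candidates = new ArrayList<>();
            }
        }
        List<String[]> result = new ArrayList<>();
        for (String[] t : candidates) {
            inspected++;
            if (matches(pattern, t)) {
                result.add(t);
            }
        }
        return result;
    }

    private static boolean matches(String[] pattern, String[] t) {
        for (int i = 0; i < 3; i++) {
            if (!isVar(pattern[i]) && !pattern[i].equals(t[i])) {
                return false;
            }
        }
        return true;
    }
}
```

In this toy store, calling {{find}} with the raw pattern {{?s p2 ?v}} walks every stored triple, whereas calling it with the result of {{substitute}} for a binding of {{?s}} only walks the triples indexed under that subject. How closely this matches what happens inside ARQ or TDB is not something the sketch can show.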



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
