Re: sparql query performance jena v2, 3, 4

Andy Seaborne Tue, 28 Feb 2023 09:19:15 -0800



On 28/02/2023 03:11, Paul Tyson wrote:

I maintain an old jena/fuseki application that has happily been using jena v2.13 and tdb v1.1.2 for several years. It loads 1b+ triples into a tdb database, and runs a couple dozen queries, some not so trivial, on the tdb.
Now it is time to update things. I first went to 3.17, to stay on java8. Many of the queries work fine, but a few have abysmal performance. A query that took maybe 10 minutes with v2.13 now runs for hours without finishing.
I am now trying v4.7 with java11. Testing is still in progress, but it doesn't look promising.
The troublesome queries have several FILTER EXISTS and FILTER NOT EXISTS clauses, some of which have UNION patterns. It is rather complicated, but also a fairly literal translation of the applicable business rules. I took a closer look at them, and adjusted the order of patterns to put the more-specific ones earlier, but that didn't help. I discovered that eliminating the UNION alternatives would let the query return some results, but obviously not what is wanted.
Did anything in particular change in the query processing since v2 that would cause this performance degradation?

v2.13 was March 2015. A lot has changed since then including fixes where optimization would get the wrong answers. Some are directly EXISTS, some aren't but if you have complex EXISTS patterns, they can be impacted. They aren't pattern orders.


(Mostly they will be in JIRA)

Should I expect any difference in tdb vs tdb2? I've tried both, and neither give satisfaction.


Unlikely. TDB2 is preferred.


Thanks in advance,
--Paul
