Re: Very very slow query when using a high OFFSET

dandh988 Sun, 17 Dec 2017 10:57:33 -0800

https://www.w3.org/TR/rdf-sparql-query/#specifyingDataset
DISTINCT is needed to ensure that multiple FROM clauses don't generate 
duplicates. This is an RDF merge, as detailed in 
https://www.w3.org/TR/rdf-mt/#graphdefs
Could the distinct be optimised out? That may or may not be allowed 
under the spec.
Mosaic uses parallel streams with distinct, and Java will optimise out the 
distinct if it knows the stream is a singular source. Otherwise it chews heap, 
because it has to store what it has already streamed to adhere to the distinct.
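
To make the heap point concrete, here is a rough sketch in plain Java streams. It is not Mosaic or Jena code: the class name, the method names and the use of plain strings as stand-in triples are all made up for illustration. merge() concatenates several "graphs" and applies distinct(), which has to remember every element it has already passed through. isKnownDistinct() checks the Spliterator.DISTINCT characteristic, which, as far as I know, is what the stream implementation looks at when deciding whether distinct() has any work left to do.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Collectors;

// Hypothetical illustration only; strings stand in for RDF triples.
class GraphMergeSketch {

    // Merge several graphs into one default graph, suppressing duplicates.
    // distinct() is a stateful operation: it keeps the elements seen so far
    // in memory, so heap use grows with the number of unique triples.
    static List<String> merge(List<List<String>> graphs) {
        return graphs.stream()
                .flatMap(List::stream)
                .distinct()
                .collect(Collectors.toList());
    }

    // True when the source itself reports that it holds no duplicates
    // (a HashSet does, an ArrayList does not).
    static boolean isKnownDistinct(Collection<String> source) {
        return source.spliterator().hasCharacteristics(Spliterator.DISTINCT);
    }

    public static void main(String[] args) {
        List<String> g1 = List.of("s1 p o", "s2 p o");
        List<String> g2 = List.of("s2 p o", "s3 p o");

        // The triple shared by both graphs appears once in the merge.
        System.out.println(merge(List.of(g1, g2)));

        System.out.println(isKnownDistinct(new HashSet<>(g1)));
        System.out.println(isKnownDistinct(g1));
    }
}
```

A single set-backed source already carries the distinct guarantee, so nothing extra needs to be stored. Once two or more sources are concatenated that guarantee is gone, and the duplicate check has to hold the seen elements on the heap.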



Dick
-------- Original message --------
From: Laura Morales <laure...@mail.com>
Date: 17/12/2017 17:53 (GMT+00:00)
To: users@jena.apache.org
Cc: users@jena.apache.org
Subject: Re: Very very slow query when using a high OFFSET

> It triggers "dynamic datasets" and the default graph does duplicate
> suppression to make multiple FROM work.

So, if I understand correctly, this is the correct behavior and you'll 
see/expect the same performance hit (query that doesn't terminate) even if you 
were using a TDB namedGraph instead of HDT. Right?
And, just out of curiosity, is there a way to fix this performance hit or is it 
just the way it is (I don't mean fix by changing query, I mean fix by answering 
the original query more efficiently)?
