https://www.w3.org/TR/rdf-sparql-query/#specifyingDataset

Distinct is needed to ensure that multiple FROM clauses don't generate duplicates. This is an RDF merge, detailed in https://www.w3.org/TR/rdf-mt/#graphdefs

Could the distinct be optimised out? It may or may not be allowed under the spec.

Mosaic uses parallel streams with distinct, and Java will optimise out the distinct if it knows the stream comes from a single source. Otherwise it chews heap, because it needs to store what it has already streamed in order to adhere to the distinct.
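To illustrate the Java side of this, here is a rough sketch. It is not Mosaic or Jena code: the class and method names are made up for the example, and triples are just strings. It relies on standard java.util.stream behaviour as I understand it: a stream over a single Set reports the DISTINCT characteristic, so distinct() has nothing to remember, while Stream.concat() of two sources drops that characteristic, so distinct() has to track every element it has already emitted.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Spliterator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; names are not from Mosaic or Jena.
public class MergeDistinct {

    // Merge two "graphs" (lists of triples as strings) the way multiple FROM
    // clauses would: concatenate, then suppress duplicates.
    static List<String> merge(List<String> g1, List<String> g2) {
        return Stream.concat(g1.stream(), g2.stream())
                .distinct() // must remember everything seen so far
                .collect(Collectors.toList());
    }

    // True if the stream's source already guarantees distinct elements.
    // Note: calling spliterator() consumes the stream.
    static boolean knownDistinct(Stream<?> s) {
        return s.spliterator().hasCharacteristics(Spliterator.DISTINCT);
    }

    public static void main(String[] args) {
        Set<String> graph = new HashSet<>(List.of("s p o1", "s p o2"));

        // Single Set source: flagged DISTINCT.
        System.out.println(knownDistinct(graph.stream()));

        // Two sources concatenated: the DISTINCT flag is lost.
        System.out.println(knownDistinct(Stream.concat(graph.stream(), graph.stream())));

        // Duplicates across the two graphs are suppressed.
        System.out.println(merge(List.of("s p o1", "s p o2"), List.of("s p o2", "s p o3")));
    }
}
```

With a large merged result and a high OFFSET, it is that "remember everything seen so far" set which eats the heap.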

Dick

-------- Original message --------
From: Laura Morales <laure...@mail.com>
Date: 17/12/2017 17:53 (GMT+00:00)
To: users@jena.apache.org
Cc: users@jena.apache.org
Subject: Re: Very very slow query when using a high OFFSET

> It triggers "dynamic datasets" and the default graph does duplicate
> suppression to make multiple FROM work.

So, if I understand correctly, this is the correct behavior and you'll see/expect the same performance hit (query that doesn't terminate) even if you were using a TDB namedGraph instead of HDT. Right?

And, just out of curiosity, is there a way to fix this performance hit or is it just the way it is (I don't mean fix by changing query, I mean fix by answering the original query more efficiently)?