Honestly, and probably because of my own lack of knowledge, I don't see how that can happen with the text index. You have a single triple pattern that queries the Lucene index for the given pattern and returns at most 10 000 documents by default.

text:query (skos:prefLabel skos:altLabel "\"xx yy\"" "lang:en" )
translates to

( (prefLabel:"\"xx yy\"" OR altLabel:"\"xx yy\"") AND lang:en)
which can indeed return duplicate documents, because a separate document is created and indexed for each triple.
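
One way to check whether duplicates are actually involved is to compare the number of hits with the number of distinct subjects. This is only a sketch: it assumes the usual text: and skos: prefixes and the same index and search string as in the original query, and it says nothing about whether Jena deduplicates hits internally.

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Compare all hits against distinct subjects for the same text query.
SELECT (COUNT(?s) AS ?hits) (COUNT(DISTINCT ?s) AS ?subjects)
WHERE {
  (?s ?score) text:query (skos:prefLabel skos:altLabel "\"xx yy\"" "lang:en") .
}
```

If ?hits is larger than ?subjects, the same subject is matched more than once, for example through both its prefLabel and its altLabel.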

I still don't get how a query with limit 1000 returns 560 results but then doesn't return 100 when using limit 100.

Currently, I find your results quite counterintuitive, but I still have a lot to learn about RDF, SPARQL and Jena.


Can you share some data to reproduce this, please?

What happens for a single property only? Pagination should work the way you're doing it: the Lucene query is executed once internally and then cached, so later requests should reuse the same Lucene document hits.
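
A single-property variant of the inner select could look like the sketch below. It is the original query with skos:altLabel removed; the prefix declarations are assumed, and the limit value is just the one from the report.

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Same text query as in the report, restricted to skos:prefLabel only.
SELECT ?s ?score
WHERE {
  (?s ?score) text:query (skos:prefLabel "\"xx yy\"" "lang:en") .
}
ORDER BY DESC(?score)
OFFSET 0
LIMIT 100
```

If this returns exactly 100 rows while the two-property version returns fewer, the shortfall is most likely related to querying both properties at once.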

On 19.10.22 08:21, Mikael Pesonen wrote:

Hi,

yes, the same select run as the only query returns exactly the limit amount of results.

On 18/10/2022 16.48, Lorenz Buehmann wrote:
Did you get those results when running only this subquery? AFAIK, the default limit of the Lucene text query is 10 000 documents, and I don't think the outer LIMIT makes it to the Lucene request.


On 18.10.22 13:35, Mikael Pesonen wrote:

I have a bigger query that starts with an inner select:

 { SELECT ?s ?score WHERE {
    (?s ?score) text:query (skos:prefLabel skos:altLabel "\"xx yy\"" "lang:en" ) .
    } order by desc(?score) offset 0 limit 1000 }

There are about 10000 results. limit 1000 returns ~560 and limit 100 ~75 results. How do I page results correctly?
