Re: SPARQL limit doesn't work
I had to reset all Jena data because the server ran out of memory during a drop graph. With clean data, paging now works. I'll let you know if the problem recurs.

--
Lingsoft - 30 years of Leading Language Management
www.lingsoft.fi
Speech Applications - Language Management - Translation - Reader's and Writer's Tools - Text Tools - E-books and M-books
Mikael Pesonen
System Engineer
e-mail: mikael.peso...@lingsoft.fi
Tel. +358 2 279 3300
Time zone: GMT+2
Helsinki Office: Eteläranta 10, FI-00130 Helsinki, FINLAND
Turku Office: Kauppiaskatu 5 A, FI-20100 Turku, FINLAND
Re: Re: SPARQL limit doesn't work
On 19.10.22 13:44, Mikael Pesonen wrote:
> On 19/10/2022 10.18, Lorenz Buehmann wrote:
>> What happens for a single property only?
> What does this mean?

You're querying two properties, i.e. two fields in the Lucene query. What happens if you use only skos:prefLabel?
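The single-property check suggested in this message might look like the sketch below. It reuses the search string and language tag from the original question; the prefix declarations and the LIMIT of 100 are assumptions added so the query stands on its own.

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Same search as in the original question, restricted to one property
# (one Lucene field) to see whether the result count then matches the limit.
SELECT ?s ?score
WHERE {
  (?s ?score) text:query (skos:prefLabel "\"xx yy\"" "lang:en") .
}
ORDER BY DESC(?score)
OFFSET 0
LIMIT 100
```

If this returns the full 100 rows while the two-property form does not, that would point at the combination of fields rather than at LIMIT itself.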
Re: SPARQL limit doesn't work
On 19/10/2022 10.18, Lorenz Buehmann wrote:
> Can you share some data please to reproduce?

Unfortunately I can't share the data. When I have time, I could of course create a similar dummy index.

> What happens for a single property only?

What does this mean?
Re: Re: SPARQL limit doesn't work
Honestly, and probably for lack of knowledge, I don't see how that can happen with the text index. You have a single triple pattern that queries the Lucene index for the given pattern and returns at most 10 000 documents by default.

    text:query (skos:prefLabel skos:altLabel "\"xx yy\"" "lang:en" )

translates to

    ( (prefLabel:"\"xx yy\"" OR altLabel:"\"xx yy\"") AND lang:en)

which can indeed return duplicate documents, because a separate document is created and indexed for each triple. I still don't understand how a query that returns 560 results with limit 1000 fails to return 100 results with limit 100.

Currently I find your results quite counterintuitive, but I still have a lot to learn about RDF, SPARQL and Jena.

Can you share some data to reproduce this, please? What happens for a single property only?

Pagination should work the way you are doing it: the Lucene query is executed once internally and then cached, so later requests should reuse the same Lucene document hits.

On 19.10.22 08:21, Mikael Pesonen wrote:
> Hi, yes, same select as only query gets exactly limit amount of triples.
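On the duplicate documents mentioned in this message: because each triple is indexed as its own document, a subject that matches on both skos:prefLabel and skos:altLabel can presumably appear more than once. One possible way to get one row per subject before paging is to group by ?s and keep the best score. This is only a sketch; the variable ?sc, the prefix declarations and the page size are assumptions, and it is not confirmed in the thread as the fix for the reported behaviour.

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Collapse multiple index hits for the same subject into a single row,
# keeping the highest score, and only then apply OFFSET / LIMIT.
SELECT ?s (MAX(?sc) AS ?score)
WHERE {
  (?s ?sc) text:query (skos:prefLabel skos:altLabel "\"xx yy\"" "lang:en") .
}
GROUP BY ?s
ORDER BY DESC(?score)
OFFSET 0
LIMIT 100
```

Grouping happens before LIMIT is applied, so the limit counts distinct subjects rather than raw index hits.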
Re: SPARQL limit doesn't work
Hi, yes: the same select, run as the only query, returns exactly the limit number of results.

On 18/10/2022 16.48, Lorenz Buehmann wrote:
> did you get those results when running only this subquery?
Re: SPARQL limit doesn't work
Did you get those results when running only this subquery?

As far as I know, the Lucene text query returns at most 10 000 documents by default, and I don't think the outer LIMIT is passed on to the Lucene request.

On 18.10.22 13:35, Mikael Pesonen wrote:
> limit 1000 returns ~560 and limit 100 ~75 results. How do I page results correctly?
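On the point that the outer LIMIT may not reach Lucene: as far as I read the Jena text documentation, text:query also accepts an optional integer limit in its argument list, placed after the query string. The sketch below assumes that argument order and uses 1000 as an illustrative value; please check it against the documentation for your Jena version.

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?s ?score
WHERE {
  # The integer after the query string is the limit handed to Lucene
  # (assumed argument position; 1000 is an illustrative value).
  (?s ?score) text:query (skos:prefLabel skos:altLabel "\"xx yy\"" 1000 "lang:en") .
}
ORDER BY DESC(?score)
OFFSET 0
LIMIT 1000
```

When paging deeper, the Lucene limit would have to be at least OFFSET plus LIMIT, otherwise later pages come back short or empty.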
SPARQL limit doesn't work
I have a bigger query that starts with an inner select:

    {
      SELECT ?s ?score WHERE {
        (?s ?score) text:query (skos:prefLabel skos:altLabel "\"xx yy\"" "lang:en" ) .
      }
      order by desc(?score) offset 0 limit 1000
    }

There are about 1 results. limit 1000 returns ~560 and limit 100 ~75 results.

How do I page results correctly?
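For the paging question itself, one general SPARQL point is that OFFSET and LIMIT only yield consistent pages when the ordering is fully determined. Rows with equal scores can otherwise come back in a different order between requests. A sketch of fetching the second page of 1000 with ?s as a tie-breaker follows; the prefix declarations and the tie-breaker are assumptions added for illustration, not part of the original query.

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Second page of 1000 rows. Ordering by ?s after the score makes the
# order deterministic when several hits share the same score.
SELECT ?s ?score
WHERE {
  (?s ?score) text:query (skos:prefLabel skos:altLabel "\"xx yy\"" "lang:en") .
}
ORDER BY DESC(?score) ?s
OFFSET 1000
LIMIT 1000
```

Each later page keeps the same ORDER BY and increases OFFSET by the page size.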