I think there is a performance issue here. When the complete result is large and you take a slice with LIMIT/OFFSET, I suspect the Jena implementation iterates through the results up to the offset and only then returns the requested rows. It does not take advantage of a previous LIMIT/OFFSET request for the same query, so going through the complete result page by page can become infeasible, because the response time increases with the OFFSET value. Is there a best practice for this use case?

--
Jean-Claude Moissinac
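To put a rough number on that concern, here is a minimal Python sketch. It rests on an assumption that is only my suspicion above, not confirmed Jena behaviour: that each query has to produce and discard `offset` rows before returning its page. The function name `rows_touched` is made up for illustration.

```python
def rows_touched(total_rows: int, page_size: int) -> int:
    """Total rows the engine produces while paging through a whole result.

    ASSUMPTION (not verified against Jena): every query regenerates and
    discards `offset` rows before it can return its own page.
    """
    touched = 0
    for offset in range(0, total_rows, page_size):
        # Rows actually returned on this page (the last page may be short).
        returned = min(page_size, total_rows - offset)
        # Cost of this page = skipped rows + returned rows.
        touched += offset + returned
    return touched


if __name__ == "__main__":
    for total in (1_000, 10_000, 100_000):
        print(total, rows_touched(total, page_size=100))
```

Under that assumption the total work grows roughly quadratically with the result size, which would explain why the last pages feel much slower than the first ones.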
On Tue, 9 Mar 2021 at 10:16, Lorenz Buehmann <buehm...@informatik.uni-leipzig.de> wrote:

> Just some comments, as you already got the answer:
>
> Pagination in SPARQL can only be done via limit + offset, as you have
> already figured out - but, formally, it is only guaranteed to be correct
> when sorting the data.
>
> Depending on the size of the data this can be expensive - in particular,
> what people always find strange is that the offset operator in SPARQL is
> not as simple as it might be in SQL, because of the semantics of SPARQL.
> There isn't a "simple" cursor like in an SQL database, so it might be
> rather slow for large offsets.
>
> Jena usually doesn't know the result size during query execution, as it's
> (afaik) using a pipelined execution (aka lazy or Volcano). The assumption
> that the size is known holds only for operations where it has to have the
> whole intermediate result computed before proceeding to the next stage
> (e.g. aggregates).
>
> Long story short: if you really think that a user needs to see all
> pages, then, as already suggested, a count in a separate query beforehand
> would do it.
>
> On 09.03.21 09:52, Donald McIntosh wrote:
> > Hi,
> >
> > I have an implementation where I would like to page through data
> > retrieved via a SPARQL query on Apache Jena in a UI. The offset and
> > limit features take me some of the way there, but they do not tell me
> > the full size of the overall result set, so that users can skip to the
> > end or to page x knowing that it will exist. I am guessing that
> > internally Jena will know the result set size from a query, as the full
> > set will have been retrieved and sorted, but perhaps this is not
> > available to the caller.
> >
> > Is there a correct and efficient way to implement this type of use case
> > in Apache Jena?
> >
> > Thanks,
> > Donald
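
For reference, the "count in a separate query" suggestion quoted above could be sketched roughly as follows. This is only an illustration in Python that builds SPARQL strings; the helper names (`count_query`, `page_query`, `page_count`) are hypothetical, and it assumes the graph pattern is passed in as plain text with any PREFIX declarations handled elsewhere. Note that `COUNT(*)` counts the solutions of the pattern, so a query using DISTINCT or grouping would need its count wrapped around a subquery instead.

```python
import math


def count_query(where_clause: str) -> str:
    """Build a separate query returning the total number of solutions."""
    return f"SELECT (COUNT(*) AS ?total) WHERE {{ {where_clause} }}"


def page_query(where_clause: str, order_by: str, page: int, page_size: int) -> str:
    """Build the query for a 1-based page number.

    ORDER BY is included because, as noted in the thread, LIMIT/OFFSET
    paging is only guaranteed to be consistent on sorted results.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    offset = (page - 1) * page_size
    return (
        f"SELECT * WHERE {{ {where_clause} }} "
        f"ORDER BY {order_by} LIMIT {page_size} OFFSET {offset}"
    )


def page_count(total: int, page_size: int) -> int:
    """Number of pages needed to show `total` rows."""
    return math.ceil(total / page_size)


if __name__ == "__main__":
    pattern = "?s ?p ?o"
    print(count_query(pattern))
    # Suppose the count query reported 101 solutions (made-up figure).
    print(page_count(101, 25))
    print(page_query(pattern, "?s ?p ?o", page=3, page_size=25))
```

With the total known up front, the UI can offer "go to page x" or "last page" links, although each of those page queries still pays the OFFSET cost discussed above.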