I think there is a performance issue here.
When the complete result is big and you take a slice with LIMIT/OFFSET, I
suspect the Jena implementation goes through the dataset until it reaches the
offset, and only then returns the result. It doesn't take advantage of a previous
OFFSET/LIMIT run of the same query, so going through the complete result
page by page using LIMIT/OFFSET can become infeasible because the response
time grows with the OFFSET value.
Is there a best practice for this use case?
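
As a rough illustration of why the cost grows, here is a toy Python simulation. It is not Jena code: `run_query` and the other names are made up for this sketch, and it assumes the engine produces and then discards the rows before the offset. Under that assumption, paging by OFFSET re-walks the earlier rows on every request, while running the query once and cutting the result stream into pages on the client walks each row once.

```python
from itertools import islice

# Counts the rows the simulated engine had to generate.
stats = {"rows": 0}

def run_query(total, offset=0, limit=None):
    """Hypothetical lazy engine: rows before `offset` are produced, then dropped."""
    stop = total if limit is None else min(total, offset + limit)
    for i in range(stop):
        stats["rows"] += 1
        if i >= offset:
            yield i

def pages_by_offset(total, page_size):
    """One query per page; each one re-walks the rows before its offset."""
    for offset in range(0, total, page_size):
        yield list(run_query(total, offset, page_size))

def pages_by_streaming(total, page_size):
    """A single query whose result stream is cut into pages on the client."""
    stream = run_query(total)
    while True:
        page = list(islice(stream, page_size))
        if not page:
            return
        yield page

def cost(pager, total, page_size):
    """Run a pager to completion and return (pages, rows generated)."""
    stats["rows"] = 0
    pages = list(pager(total, page_size))
    return pages, stats["rows"]
```

With 1000 rows and pages of 100, `cost(pages_by_offset, 1000, 100)` counts 5500 generated rows, against 1000 for `cost(pages_by_streaming, 1000, 100)`, and both return the same pages.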
--
Jean-Claude Moissinac



On Tue, 9 Mar 2021 at 10:16, Lorenz Buehmann <
buehm...@informatik.uni-leipzig.de> wrote:

> just some comments as you already got the answer:
>
> pagination in SPARQL can only be done via limit + offset, as you already
> have figured out - but, formally, it is only guaranteed to be correct
> when sorting the data.
>
> depending on the size of the data this can be expensive - what people
> often find strange is that the offset operator in SPARQL is not
> as simple as it might be in SQL because of the semantics of SPARQL.
> There isn't a "simple" cursor like in an SQL database, so it might be
> rather slow for large offsets.
>
> Jena usually doesn't know the result size during query execution as it's
> (afaik) using a pipelined execution (aka lazy or Volcano) - only for
> operations where it has to have the whole intermediate result computed to
> proceed to the next stage (e.g. aggregates) does this assumption hold.
>
>
> long story short: if you really think that a user needs to see all
> pages, as already suggested, a count in a separate query beforehand would do it.
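
A minimal sketch of that approach in Python, only building the query strings and doing the page arithmetic. The helper names and the `?s ?p ?o` pattern are placeholders of mine, not part of any Jena API. One COUNT query gives the total, and each page query carries an ORDER BY so that the LIMIT/OFFSET slices stay consistent.

```python
import math

def count_query(pattern):
    """SPARQL query returning the total number of solutions for `pattern`."""
    return f"SELECT (COUNT(*) AS ?total) WHERE {{ {pattern} }}"

def page_query(pattern, order_by, page, page_size):
    """SPARQL query for a 1-based page number; ORDER BY keeps pages consistent."""
    offset = (page - 1) * page_size
    return (f"SELECT * WHERE {{ {pattern} }} "
            f"ORDER BY {order_by} LIMIT {page_size} OFFSET {offset}")

def page_count(total, page_size):
    """Number of pages needed to show `total` rows."""
    return math.ceil(total / page_size)
```

The UI would run `count_query("?s ?p ?o")` once, pass the returned total to `page_count`, and then request any page directly with `page_query`.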
>
> On 09.03.21 09:52, Donald McIntosh wrote:
> > Hi..
> >
> > I have an implementation where I would like to page through data
> > retrieved via a SPARQL query on Apache Jena on a UI.  offset and limit
> > features take me some of the way there but do not tell me the full size of
> > the overall result set so that users can skip to the end or to page x
> > knowing that it will exist.  I am guessing that internally Jena will know
> > the result set size from a query but perhaps this is not available to the
> > caller, as the full set will have been retrieved and sorted.
> >
> > Is there a correct and efficient way to implement this type of use case
> > in Apache Jena?
> >
> > Thanks,
> > Donald
>
