The difference, as Andy implied, is in the protocol definition. SPARQL as a protocol is tied to HTTP and does not expose any notion of cursors the way a traditional RDBMS would. Once ARQ passes a query off to a remote SPARQL service, it has no control over how the remote service executes that query.
But yes, you could probably implement a full-blown client-side cursor in Jena, layered over the basic implementation.

Internally, ARQ itself does streaming execution. So when Andy refers to top-k, what he means is that ARQ allocates an internal buffer of k results (k = LIMIT + OFFSET) when k is below a certain threshold and, as it executes the query, keeps only the top k results necessary to answer it. However, this does still require the entire query to be evaluated, i.e. you still have to find and sort all possible results, even if only in reference to the top k currently stored.

At the Jena client layer you can already execute the query without LIMIT/OFFSET (by removing those clauses from the parsed query), then turn the results into a ResultSetRewindable [1][2] and page through that in your own code. But depending on the query (and dataset) you may be trading off memory for time.

Rob

[1] https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/query/ResultSetFactory.html#makeRewindable(org.apache.jena.rdf.model.Model)
[2] https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/query/ResultSetRewindable.html

On 13/05/2021, 23:00, "graham" <gra...@orangedogsoftware.com> wrote:

Hi

I am a little confused by this discussion. I understand the original poster's question -- what they are talking about is an incredibly common use case. Run query, look at the first 1-200 results, add filters, re-run new query, rinse and repeat.

If this were JDBC the client would run one query with no limits/offsets, so in this example the query with the results ordered by date. The client UI would then implement the paging itself. The client would cache past pages, so that users can page backward, and forward paging is just standard JDBC result streaming. Admittedly if you were clever you could also do a dance with a scrolling result set, although personally I don't find them all that useful.

What isn't clear to me is why you can't do the same thing with SPARQL?
In other words, I am unclear where the implementation issues with SPARQL/Jena are occurring.

thanks
graham

I don't know what Jena's policy regarding diverging from the SPARQL standards is, but

On 14/05/21 7:32 am, Andy Seaborne wrote:
> On 11/05/2021 16:54, Kimball, Adam wrote:
>> I know that I’ve asked this question before, but I am still
>> struggling to understand how I might handle this case:
>>
>> I have a Jena DB of event entries. One common way to view the events
>> is to page through them. Normally this is done by seeing the most
>> recent 50 events and then paging to the next 50 most recent and so on.
>>
>> In pure SPARQL, I don’t really see an efficient way to accomplish
>> this. With limit and offset, I don’t really save anything other than
>> i/o since the whole result set will need to be ordered before this
>> limit/offset has an effect. And that is killing us now.
>>
>> My guess is we will need to implement some caching or possibly index
>> the graph with Lucene or something. It is doable but definitely not
>> ideal. Maybe I can use the quad position to facilitate this? I am
>> assuming this cannot be optimized within Jena itself?
>>
>> Best,
>> Adam
>>
>
> Hi Adam,
>
> No - there isn't a better way in std SPARQL. If you think the app is
> going to process all the results, reading the whole thing into some
> local cache is a way to go.
>
> The proper solution is an overhaul of the SPARQL protocol.
>
> Also, HTTP/2 may offer some interesting possibilities.
>
> Specific to ARQ: query execution is often predictable and stable
> order. There aren't many places where - absent concurrent updates -
> the order will be different from call to call.
>
> FWIW Jena does optimize "top k" sorts SELECT-sort-LIMIT/OFFSET up to
> (from memory) k=1000 items.
>
> > Maybe I can use the quad position to facilitate this?
>
> Not sure what the idea is here.
>
> Andy
>
>

--
Doubt is a pain too lonely to know that faith is his twin brother. - Kahlil Gibran
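
To make the top-k point in this thread concrete, here is a minimal sketch in plain Java of a bounded buffer that keeps only k = LIMIT + OFFSET rows while scanning the input. It is not ARQ's actual code: the class name TopKSketch, the method page, and the sample timestamps are all hypothetical and exist only for illustration. It shows why memory stays bounded by k while every row still has to be visited and compared.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TopKSketch {
    // Hypothetical helper, not part of Jena: returns the page
    // [offset, offset + limit) of the rows in sorted order while holding
    // at most limit + offset rows in memory at any time.
    public static <T> List<T> page(Iterable<T> rows, Comparator<T> order,
                                   int limit, int offset) {
        int k = limit + offset;
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Heap ordered by the reversed comparator, so the head is always
        // the worst row currently kept.
        PriorityQueue<T> buffer = new PriorityQueue<>(k, order.reversed());
        for (T row : rows) { // every row is still visited once
            if (buffer.size() < k) {
                buffer.add(row);
            } else if (order.compare(row, buffer.peek()) < 0) {
                buffer.poll();   // evict the worst kept row
                buffer.add(row); // keep the better one
            }
        }
        List<T> kept = new ArrayList<>(buffer);
        kept.sort(order);
        // Skip the first 'offset' rows; what remains is the requested page.
        int from = Math.min(offset, kept.size());
        return new ArrayList<>(kept.subList(from, kept.size()));
    }

    public static void main(String[] args) {
        List<Integer> timestamps = List.of(5, 1, 9, 3, 7, 2, 8);
        // Most recent first, page size 2, second page.
        System.out.println(page(timestamps, Comparator.<Integer>reverseOrder(), 2, 2));
        // prints [7, 5]
    }
}
```

Each later page needs a larger k, so deep paging this way gets steadily more expensive. That is the trade-off against reading everything once into a client-side cache and paging through that.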