Andy,

yes. We don't use Fuseki, but we have a similar 3-tier architecture. I guess my issue was more that the core DB should not employ an implicit cache that its clients cannot control, but from your explanation below, I think we are on the same page. In fact, we also use ETags and augment "sparql" with query parameters for cached paging. The only difference is that we move cached paging entirely outside of SPARQL. E.g. we have something like

POST ...?query&pageSize=...

(where the query is in the body), and the answer provides a unique token which can be used to browse the pages until they expire, e.g.

GET ....?token=<myToken>&pageSize=...&page=...

If our clients do not want the caching behavior, we tell them to use OFFSET/LIMIT instead.

Simon

From: Andy Seaborne <[email protected]>
To: [email protected]
Date: 08/08/2011 10:36 AM
Subject: Re: SPARQL queries and paging

On 08/08/11 14:35, Simon Helsen wrote:
> On this topic, I'd like to point out that we have a separate outside
> mechanism for paging which behaves like you suggest, however, the
> difference with OFFSET/LIMIT is that the next time someone makes the query
> e.g. to obtain the next page, we do expect the query to be recalculated
> since the state of the store may have changed.
>
> So, if you plan to change the behavior by introducing a caching model, you
> may actually alter the behavior unless you are able to determine that a
> subsequent execution of a query would not have changed results (e.g. by
> having the actions isolated in a transaction?)
>
> Simon

Transactions, or just Fuseki noting updates (language or graph store protocol), can be used to give a version id to each state, and then ETags can be used to drive cache invalidation.

It's a three-layer model:

    client
    SPARQL cache
    core DB

ETags are between the SPARQL cache and the core DB.

The protocol between each layer is the SPARQL protocol.

The SPARQL cache can keep whole result sets for pseudo paging using ORDER/OFFSET/LIMIT with different policies, but it could also augment the SPARQL protocol with parameters like ?page= or ?liveness=uselastquery for better control (in addition to trying to intuit from requests and version ids).

Being able to set different consistency/cache efficiency tradeoffs in client of cache server might be useful.

Andy
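
For illustration only, the token-based cached paging described in the reply at the top of this message could be sketched roughly as below. This is not the actual implementation discussed in the thread: the class name PagedQueryCache, the run_query callable, the ttl_seconds expiry and the 0-based page index are all assumptions made up for this example.

```python
import time
import uuid


class PagedQueryCache:
    """Keeps whole result sets so pages can be browsed by token until they expire."""

    def __init__(self, run_query, ttl_seconds=300.0, clock=time.monotonic):
        self._run_query = run_query  # hypothetical callable: query string -> iterable of rows
        self._ttl = ttl_seconds      # assumed expiry policy: fixed lifetime per token
        self._clock = clock          # injectable clock, mainly so expiry can be tested
        self._entries = {}           # token -> (expiry time, cached rows)

    def submit(self, query, page_size):
        """Models POST ...?query&pageSize=...: run the query once and cache the rows."""
        rows = list(self._run_query(query))
        token = uuid.uuid4().hex
        self._entries[token] = (self._clock() + self._ttl, rows)
        return token, rows[:page_size]

    def page(self, token, page_size, page):
        """Models GET ...?token=...&pageSize=...&page=... (page is 0-based here)."""
        entry = self._entries.get(token)
        if entry is None or entry[0] <= self._clock():
            # Unknown or expired: drop the entry and make the client re-submit.
            self._entries.pop(token, None)
            raise KeyError("unknown or expired token")
        rows = entry[1]
        start = page * page_size
        return rows[start:start + page_size]
```

A client that does not want this behavior would bypass such a cache and send OFFSET/LIMIT queries directly, so that each page is recalculated against the current state of the store.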
