Re: [jira] [Commented] (JENA-626) SPARQL Query Caching

Andy Seaborne Tue, 10 Feb 2015 06:46:29 -0800

On 03/02/15 11:13, Saikat Maitra wrote:

Hello Andy,


I have build a prototype version for the SPARQL Query Cache implementation.

I have provided the implementation details below:

1. Created CacheStore class with utility operation as doGet, doSet and
doUnset cache.
2. Created CacheClient interface and implemented client for local in memory
cache.
3. Created CacheAction class with enum fields as READ_CACHE and WRITE_CACHE.
4. Created Cache class as a wrapper object to hold the Cache result(Json
result as of now) and SPARQLResultSet
5.  I check in SPARQLQuery that if cache is null or is it  initialised.
6. If it is null I set WRITE_CACHE action and pass cache object to
ResponseResultSet.
7. ResponseResultSet creates a StringBuilder object and pass it to
IndentedWriter.
8. As Query results are iterated and written in ServletOut Stream I also
append the data in StringBuilder object.
9. Before flushing the data to Outstream I store the StringBuilder object
which contain the json result in cache and set data in cache object has
been initialised.
10. If CacheStore already contain the data then I retrieve the data from
cache and write it to ServletOutStream and flush the data.

Here are the code details in my jena fork.

https://github.com/samaitra/jena/tree/master/jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/cache

https://github.com/samaitra/jena/blob/master/jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/servlets/ResponseResultSet.java

https://github.com/samaitra/jena/blob/master/jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/servlets/SPARQL_Query.java


Please let me know your feedback.

Currently I have tested the implementation with Ask and Select Queries. I
still need to test it for Construct and Describe Queries. I will also need
to make modification for returning thrift response.

Regards

Saikat


Hi Saikat,

A few comments and questions from looking at the code.

1/ At the moment the cache is deeply integrated into SPARQL_Query. Wouldit be practical to have it as a separate service endpoint that invokesSPARQL_Query? There could be different caches for each service, andresources allocated appropriately.

2/ I found calling the individual cache entry a "Cache" object a bitconfusing. "Cache" is usually (to me) the whole thing. Would"CacheEntry" be a better name?

3/ CacheBuilder - is the idea where to cache the HTTP response as astring, or rather the last response as the right content type? The mostbenefit comes from not executing the query - I'm not sure that worryingabout converting the results to send is the big win.

4/ The thing cached is a "SPARQLResult" object from executeQuery.Unfortunately, in the case of SELECT queries, ResultSets from query are"use once" (they are iterators) so isn't the second time they are usedalways going to be empty results? I think the code needs to take a copyof the results and I didn't see any copy being taken. Did I miss it?

If a copy is needed, and to support disk usage, using the Thrift format(when not an in-memory cache) should be fastest.


5/ How do cache entries get invalidated when an update happens?

6/ CacheAction - is this "work in progress" in some way. I wasn't ableto see what it was for.


        Andy

Re: [jira] [Commented] (JENA-626) SPARQL Query Caching

Reply via email to