GitHub user osma commented on the pull request:
https://github.com/apache/jena/pull/95#issuecomment-213564399
@ajs6f said:
> If Jena is going to do really useful caching, it's going to be caching
> something other than bits, something specific to what Jena does, like
> node-tuples or result rows or the like.
There are indeed some things a Jena-specific cache can do better than a
general purpose cache such as Varnish. Many have already been mentioned, but to
recap:
1. The cache can be aware of changes in the underlying data and keep
responses "forever", but invalidate them immediately when data changes. An
external cache has to be manually purged. (Supporting ETag, Last-Modified,
conditional GETs, etc. would help invalidate an external cache automatically,
though.)
2. If the cache works at the ResultSet level, it becomes possible to serve
different serializations of the same result and to page through it.
3. The cache can be aware of the cost of executing the query. Holding on to
a result that can be recalculated from the data in 5ms is basically a waste of
memory. Perhaps the cache, when it has to evict older results, could prefer
keeping results that took a long time to generate. Or there could be a
threshold that determines what to cache - e.g. don't keep anything that was
generated in less than 10ms.
4. An external cache is always more infrastructure to set up and maintain,
and configuring a general purpose HTTP proxy is not exactly trivial. Even if I
intend to use Varnish in production, this kind of integrated cache could be
very useful in development and testing. Just a single setting could give a
noticeable performance boost to Fuseki.
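To make points 1 and 3 concrete, here is a minimal sketch of what such a cache might look like. It is written in Python for brevity and is not based on any existing Jena or Fuseki API: the class `CostAwareCache`, its methods, and the parameters `max_entries` and `min_cost_ms` are all hypothetical names chosen for illustration. It assumes the caller measures how long each query took and tells the cache whenever the underlying data changes.

```python
from typing import Any, Dict, Hashable, Optional, Tuple


class CostAwareCache:
    """Hypothetical query-result cache (illustration only, not a Jena API)."""

    def __init__(self, max_entries: int = 100, min_cost_ms: float = 10.0):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self.min_cost_ms = min_cost_ms
        # query key -> (result, time it took to compute in milliseconds)
        self._entries: Dict[Hashable, Tuple[Any, float]] = {}

    def get(self, query: Hashable) -> Optional[Any]:
        """Return the cached result, or None if the query is not cached."""
        entry = self._entries.get(query)
        return entry[0] if entry is not None else None

    def put(self, query: Hashable, result: Any, cost_ms: float) -> None:
        """Offer a result to the cache along with its measured cost."""
        # Point 3, threshold: results that are cheap to recompute are not kept.
        if cost_ms < self.min_cost_ms:
            return
        if query not in self._entries and len(self._entries) >= self.max_entries:
            # Point 3, eviction: drop the entry that is cheapest to recompute,
            # unless the new result is itself no more expensive than that entry.
            cheapest = min(self._entries, key=lambda q: self._entries[q][1])
            if self._entries[cheapest][1] >= cost_ms:
                return
            del self._entries[cheapest]
        self._entries[query] = (result, cost_ms)

    def invalidate_all(self) -> None:
        """Point 1: call on any data change so no stale result is served."""
        self._entries.clear()


cache = CostAwareCache(max_entries=2, min_cost_ms=10.0)
cache.put("q_fast", "rows-a", cost_ms=5.0)       # below threshold, not stored
cache.put("q_slow", "rows-b", cost_ms=800.0)
cache.put("q_medium", "rows-c", cost_ms=50.0)
cache.put("q_slower", "rows-d", cost_ms=2000.0)  # cache full, evicts q_medium
cache.invalidate_all()                           # underlying data changed
```

Clearing everything on any update is the bluntest possible invalidation; a real implementation could track which graphs a result depends on, but that is beyond this sketch.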