GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/119
JENA-999: jena-text Lucene cache using multimaps
This set of commits implements a caching layer for Lucene queries. The
cache is stored in the Context so that it persists even when new
TextQueryPF instances are created. Cache entries for query results are Guava
Multimaps, which allow efficient lookup of known subject URIs when the
subject is already bound.
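The idea above can be illustrated with a minimal, JDK-only sketch. It is not
the code in this pull request: a plain Map<String, Object> stands in for the
Jena Context, a Map<String, List<...>> stands in for a Guava Multimap, and
the names CachedHit, QueryCacheSketch, CONTEXT_KEY and the index function
are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for one Lucene hit: a subject URI and its score. */
final class CachedHit {
    final String subjectUri;
    final float score;

    CachedHit(String subjectUri, float score) {
        this.subjectUri = subjectUri;
        this.score = score;
    }
}

/** Hypothetical cache: query string -> (subject URI -> hits for that subject). */
final class QueryCacheSketch {
    // Assumed key name; the real patch stores its cache in the Jena Context.
    static final String CONTEXT_KEY = "example.text.queryCache";

    private final Map<String, Map<String, List<CachedHit>>> cache = new HashMap<>();
    int indexCalls = 0; // counts how often the (simulated) index is actually queried

    /** Fetch the cache from a long-lived context map, creating it on first use. */
    static QueryCacheSketch fromContext(Map<String, Object> context) {
        return (QueryCacheSketch) context.computeIfAbsent(CONTEXT_KEY, k -> new QueryCacheSketch());
    }

    /** Return all hits for a query grouped by subject, querying the index only on a miss. */
    Map<String, List<CachedHit>> lookup(String query, Function<String, List<CachedHit>> index) {
        Map<String, List<CachedHit>> bySubject = cache.get(query);
        if (bySubject == null) {
            indexCalls++;
            bySubject = new LinkedHashMap<>();
            for (CachedHit hit : index.apply(query)) {
                bySubject.computeIfAbsent(hit.subjectUri, s -> new ArrayList<>()).add(hit);
            }
            cache.put(query, bySubject);
        }
        return bySubject;
    }

    /** Bound-subject case: one keyed lookup instead of scanning every result. */
    List<CachedHit> hitsForSubject(String query, String subjectUri,
                                   Function<String, List<CachedHit>> index) {
        return lookup(query, index).getOrDefault(subjectUri, Collections.emptyList());
    }
}
```

Because the cache object lives in the shared context map rather than in the
object that evaluates a query, a freshly created evaluator that calls
fromContext with the same map sees the entries stored by earlier ones. This
sketch has no eviction or invalidation, so it would grow without bound if the
context outlived its usefulness.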
@afs I hope I did the Context storage right. You said it would have the
right lifetime, and I hope that's true, since otherwise memory leaks may occur.
I looked at Stephen Allen's example from the jena-text-cache experimental branch:
https://github.com/apache/jena/commit/45081fabe012c56b3fc7ae6a92b4518245779eb2
I have verified that this gives good performance with Stephen's example
queries, even in the UNION case where TextQueryPF is recreated over and over.
For example, a query with 11,111 results is answered in less than 300 ms.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/osma/jena jena-text-lucene-cache
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/jena/pull/119.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #119
----
commit a7bb1094a1750492c290d03ad3957d8fe42d4e2c
Author: Osma Suominen <[email protected]>
Date: 2015-12-22T16:45:50Z
very simple caching of Lucene query results in a hash map
commit af302e2b5cfa3ff2db9e1901dc36df547b1c4bad
Author: Osma Suominen <[email protected]>
Date: 2016-01-05T20:05:31Z
move Lucene query cache to Context for some persistence
commit b54e38bc00cfa3ddbb3969c4d8fb1efe658af9ea
Author: Osma Suominen <[email protected]>
Date: 2016-01-05T20:07:24Z
remove unused import
commit 718d275a7c5f160a0050ba392fdc1affadea093a
Author: Osma Suominen <[email protected]>
Date: 2016-01-05T20:34:02Z
store Multimaps in the cache for more efficient retrieval of known subject
URIs
----