GitHub user osma opened a pull request:

    https://github.com/apache/jena/pull/119

    JENA-999: jena-text Lucene cache using multimaps

    This set of commits implements a caching layer for Lucene queries. The 
cache is stored in the Context so that it persists even when new 
TextQueryPF instances are created. Cache entries for query results are Guava 
Multimaps, which allow efficient lookup of known subject URIs when the 
subject is already bound.
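
    A minimal sketch of the idea, not the code in this pull request. The names 
LuceneCacheSketch, Hit, results and hitsForSubject are made up for 
illustration, and a plain HashMap of lists stands in for both the Context and 
the Guava Multimap so that the example runs without dependencies:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class LuceneCacheSketch {
    /** Hypothetical stand-in for a text index hit: subject URI plus score. */
    static final class Hit {
        final String subject;
        final float score;

        Hit(String subject, float score) {
            this.subject = subject;
            this.score = score;
        }
    }

    // Assumption: this map plays the role of the Context. It outlives any
    // single property function instance, so repeated queries share results.
    // Outer key: query string. Inner map: subject URI -> hits (multimap-like).
    // Not thread-safe; a real cache would also need eviction.
    final Map<String, Map<String, List<Hit>>> cache = new HashMap<>();

    // Counts how often the (fake) index is actually queried.
    int indexQueries = 0;

    /** Returns all hits for a query, grouped by subject, querying the index only on a miss. */
    Map<String, List<Hit>> results(String queryString, Function<String, List<Hit>> index) {
        return cache.computeIfAbsent(queryString, q -> {
            indexQueries++;
            Map<String, List<Hit>> bySubject = new HashMap<>();
            for (Hit hit : index.apply(q)) {
                bySubject.computeIfAbsent(hit.subject, s -> new ArrayList<>()).add(hit);
            }
            return bySubject;
        });
    }

    /** Bound-subject case: one keyed lookup instead of scanning every hit. */
    List<Hit> hitsForSubject(String queryString, String subject,
                             Function<String, List<Hit>> index) {
        return results(queryString, index).getOrDefault(subject, Collections.emptyList());
    }

    public static void main(String[] args) {
        LuceneCacheSketch sketch = new LuceneCacheSketch();
        // Fake index that returns the same two hits for any query string.
        Function<String, List<Hit>> fakeIndex = q -> Arrays.asList(
                new Hit("http://example.org/a", 1.0f),
                new Hit("http://example.org/b", 0.5f));

        sketch.hitsForSubject("label:cat", "http://example.org/a", fakeIndex);
        sketch.hitsForSubject("label:cat", "http://example.org/b", fakeIndex);
        System.out.println("index queries: " + sketch.indexQueries); // prints "index queries: 1"
    }
}
```

    The second lookup uses the same query string, so it is served from the 
cached map and the index is not queried again.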
    
    @afs I hope I did the Context storage right. You said it would have the 
right lifetime; I hope that's true, since otherwise the cache could leak 
memory. I looked at Stephen Allen's example from the jena-text-cache 
experimental branch: 
https://github.com/apache/jena/commit/45081fabe012c56b3fc7ae6a92b4518245779eb2
    
    I have verified that this gives good performance with Stephen's example 
queries, even in the UNION case where TextQueryPF is recreated over and over. 
For example, a query with 11,111 results is answered in less than 300 ms.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/osma/jena jena-text-lucene-cache

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/jena/pull/119.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #119
    
----
commit a7bb1094a1750492c290d03ad3957d8fe42d4e2c
Author: Osma Suominen <[email protected]>
Date:   2015-12-22T16:45:50Z

    very simple caching of Lucene query results in a hash map

commit af302e2b5cfa3ff2db9e1901dc36df547b1c4bad
Author: Osma Suominen <[email protected]>
Date:   2016-01-05T20:05:31Z

    move Lucene query cache to Context for some persistence

commit b54e38bc00cfa3ddbb3969c4d8fb1efe658af9ea
Author: Osma Suominen <[email protected]>
Date:   2016-01-05T20:07:24Z

    remove unused import

commit 718d275a7c5f160a0050ba392fdc1affadea093a
Author: Osma Suominen <[email protected]>
Date:   2016-01-05T20:34:02Z

    store Multimaps in the cache for more efficient retrieval of known subject URIs

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
