[GitHub] jena pull request: JENA-999: jena-text Lucene cache using multimap...

osma Sat, 09 Jan 2016 01:19:27 -0800

Github user osma commented on the pull request:

    https://github.com/apache/jena/pull/119#issuecomment-170214419
  
    I have now implemented the changes. I went for a 10-slot Atlas cache. As 
far as I understand it is an LRU cache, though the choice of implementation is 
left to the CacheFactory. I think 10 slots is large enough that a single SPARQL 
query is unlikely ever to fill it (I can imagine, say, 2 text queries each being 
fed 4 different properties, which would fill 8 slots, but not much more), yet 
small enough not to use huge amounts of memory. If a TextHit object within a 
Multimap takes around 100-200 bytes (a very rough estimate) and a text query 
normally returns at most 10000 results, then each entry takes at most around 
1-2MB and the whole cache at most around 10-20MB, usually a lot less than that.
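
    As a rough illustration of the sizing argument, here is a minimal sketch of 
a 10-slot LRU cache and the worst-case memory estimate. It is a stand-in built 
on `java.util.LinkedHashMap`, not the Atlas implementation; the names 
`LruSketch`, `LruCache` and `worstCaseBytes` are hypothetical, and the byte 
counts are the rough guesses from above, not measurements.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruSketch {

    /** Hypothetical stand-in for a small LRU cache (not the Atlas code). */
    static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxSlots;

        LruCache(int maxSlots) {
            // accessOrder = true: iteration order goes from least to most
            // recently accessed entry, which is what LRU eviction needs.
            super(16, 0.75f, true);
            this.maxSlots = maxSlots;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Called after each insertion; evict once the cache is over capacity.
            return size() > maxSlots;
        }
    }

    /** Upper bound on cache size: slots * hits per entry * bytes per hit. */
    static long worstCaseBytes(int slots, int hitsPerEntry, int bytesPerHit) {
        return (long) slots * hitsPerEntry * bytesPerHit;
    }

    public static void main(String[] args) {
        // Assumed figures: 10 slots, 10000 hits per entry, 100-200 bytes per hit.
        System.out.println("low estimate  (bytes): " + worstCaseBytes(10, 10_000, 100));
        System.out.println("high estimate (bytes): " + worstCaseBytes(10, 10_000, 200));
    }
}
```

    With 8 slots in use for the scenario above (2 text queries times 4 
properties), such a cache would still have 2 slots spare before anything is 
evicted.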
    
    In the last commit (32c1c13) I switched to `getOrFill` to handle cache 
misses. I'm not sure it improves clarity. The number of lines of code stays the 
same, and to me it is less evident when the code inside the lambda expression 
actually runs than it is with the explicit `if (results == null)` block. 
Thoughts?
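
    To make the comparison concrete, here is a minimal sketch of the two 
styles side by side. It does not use the real Atlas classes: `MiniCache` is a 
hypothetical stand-in whose `getOrFill` takes a `Supplier` (the actual Atlas 
signature may differ), and `runQuery` stands in for the Lucene lookup.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class CacheMissStyles {

    /** Hypothetical stand-in for a cache; not the Atlas Cache interface. */
    static final class MiniCache<K, V> {
        private final Map<K, V> map = new HashMap<>();

        V getIfPresent(K key) { return map.get(key); }

        void put(K key, V value) { map.put(key, value); }

        V getOrFill(K key, Supplier<V> filler) {
            V value = map.get(key);
            if (value == null) {
                // The lambda runs only here, on a cache miss.
                value = filler.get();
                map.put(key, value);
            }
            return value;
        }
    }

    /** Counts how often the (pretend) expensive query actually runs. */
    static int queriesRun = 0;

    /** Hypothetical placeholder for the real text index query. */
    static List<String> runQuery(String queryString) {
        queriesRun++;
        return Arrays.asList(queryString + "#hit1", queryString + "#hit2");
    }

    /** Explicit style: the miss handling is visible at the call site. */
    static List<String> explicitStyle(MiniCache<String, List<String>> cache, String q) {
        List<String> results = cache.getIfPresent(q);
        if (results == null) {
            results = runQuery(q);
            cache.put(q, results);
        }
        return results;
    }

    /** Lambda style: the cache decides when the query runs. */
    static List<String> lambdaStyle(MiniCache<String, List<String>> cache, String q) {
        return cache.getOrFill(q, () -> runQuery(q));
    }

    public static void main(String[] args) {
        MiniCache<String, List<String>> cache = new MiniCache<>();
        explicitStyle(cache, "label:foo");
        explicitStyle(cache, "label:foo");   // served from the cache
        lambdaStyle(cache, "label:bar");
        lambdaStyle(cache, "label:bar");     // served from the cache
        System.out.println("queries actually run: " + queriesRun);
    }
}
```

    Both methods behave the same in this sketch. The difference is only where 
the reader sees the miss handling: at the call site, or inside `getOrFill`.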
    
    Other than the style issue, I'd like to merge this soon and move on with 
other jena-text things that I have in the pipeline :)



