[ 
https://issues.apache.org/jira/browse/LUCENE-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500263#comment-13500263
 ] 

Shai Erera commented on LUCENE-3441:
------------------------------------

bq. why not have a single instance of the LRUCache for all time, and never call 
.clear() on it?

That will help as long as previous TR instances are indeed on their way to die. 
Otherwise, if e.g. an app, for some reason, reopens a TR but doesn't close the 
old one and uses both (again, for some really unknown reason), then two TR 
instances might affect each other.

Now, since that's a very stupid thing to do, I'm not sure that I care about 
this much, as long as we preserve correctness. Meaning, if one instance 
reduces the size of the cache while another increases it, that's the app's 
problem. And if the two instances evict each other's entries from the LRU 
cache left, right and center, that's the app's problem too.

The correctness issue that I'm worried about is this (suppose that TR-1 and 
TR-2 share the same LRUCache instance):
* TR-1 looks for a category "foo", doesn't find it and adds to the cache the 
fact that the category is unknown
* TR-2 looks for the category "foo", which exists in its newer version of the 
taxonomy, and receives the ordinal -1, which denotes that the category doesn't 
exist --- WRONG !!
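
To make the scenario concrete, here is a minimal sketch of it. This is not the actual taxonomy reader code: the class StaleMissSketch, its Reader, and the in-memory map standing in for the on-disk taxonomy are all hypothetical, and the LRU cache is approximated with a LinkedHashMap.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; none of these names are Lucene API. */
class StaleMissSketch {
    static final int INVALID_ORDINAL = -1;

    /** Small LRU cache built on LinkedHashMap's access-order mode. */
    static <K, V> Map<K, V> newLruCache(final int maxSize) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    /** Stand-in for a TR: a private taxonomy snapshot plus a shared cache. */
    static class Reader {
        private final Map<String, Integer> onDisk; // assumed stand-in for the index
        private final Map<String, Integer> cache;  // shared between readers

        Reader(Map<String, Integer> onDisk, Map<String, Integer> cache) {
            this.onDisk = onDisk;
            this.cache = cache;
        }

        int getOrdinal(String category) {
            Integer cached = cache.get(category);
            if (cached != null) {
                return cached;
            }
            Integer found = onDisk.get(category);
            int ordinal = (found == null) ? INVALID_ORDINAL : found;
            cache.put(category, ordinal); // misses are cached too: the problem
            return ordinal;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> shared = newLruCache(100);
        Map<String, Integer> oldSnapshot = new HashMap<>();
        Map<String, Integer> newSnapshot = new HashMap<>();
        newSnapshot.put("foo", 1); // "foo" exists only in the newer taxonomy

        Reader tr1 = new Reader(oldSnapshot, shared);
        Reader tr2 = new Reader(newSnapshot, shared);

        System.out.println("TR-1: " + tr1.getOrdinal("foo")); // miss, cached as -1
        System.out.println("TR-2: " + tr2.getOrdinal("foo")); // served the stale -1
    }
}
```

TR-2 never consults its own snapshot here, because the shared cache already answers with the miss that TR-1 recorded.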

To solve that, we could avoid storing in the cache the fact that a category 
does not exist. Really, lookups of missing categories shouldn't happen - apps 
do not ask the taxonomy for random categories. So then:

* TR-1 looks for a category "foo", doesn't find it in the cache and DOES NOT 
update the cache w/ that info. It goes to disk, doesn't find it there, returns 
-1.
* TR-2 looks for the category "foo", which exists in its newer version of the 
taxonomy, fetches it from disk and stores the ordinal in the cache.
* TR-1 looks for the category "foo" again, now receives an ordinal which is 
larger than its taxonomy size --- might be a problem !!
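
The same three steps can be sketched as follows, again with hypothetical names (NoMissCachingSketch, Reader, getSize()) and an in-memory map assumed in place of the on-disk taxonomy. The shared cache is a plain HashMap here, so LRU eviction is left out for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; none of these names are Lucene API. */
class NoMissCachingSketch {
    static final int INVALID_ORDINAL = -1;

    /** Stand-in for a TR: a private taxonomy snapshot plus a shared cache. */
    static class Reader {
        private final Map<String, Integer> onDisk; // assumed stand-in for the index
        private final Map<String, Integer> cache;  // shared between readers

        Reader(Map<String, Integer> onDisk, Map<String, Integer> cache) {
            this.onDisk = onDisk;
            this.cache = cache;
        }

        int getOrdinal(String category) {
            Integer cached = cache.get(category);
            if (cached != null) {
                return cached;
            }
            Integer found = onDisk.get(category);
            if (found == null) {
                return INVALID_ORDINAL; // a miss is returned but never cached
            }
            cache.put(category, found); // only real ordinals go into the cache
            return found;
        }

        int getSize() {
            return onDisk.size();
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> shared = new HashMap<>();
        Map<String, Integer> oldSnapshot = new HashMap<>();
        oldSnapshot.put("a", 0);
        Map<String, Integer> newSnapshot = new HashMap<>(oldSnapshot);
        newSnapshot.put("foo", 1); // added after TR-1 was opened

        Reader tr1 = new Reader(oldSnapshot, shared);
        Reader tr2 = new Reader(newSnapshot, shared);

        System.out.println("TR-1: " + tr1.getOrdinal("foo")); // not found
        System.out.println("TR-2: " + tr2.getOrdinal("foo")); // found and cached
        int again = tr1.getOrdinal("foo");                     // now a cache hit
        System.out.println("TR-1 again: " + again + ", size " + tr1.getSize());
    }
}
```

In this sketch the stale miss is gone, but on the second lookup TR-1 receives an ordinal that its own, older taxonomy does not contain.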

In general, since I don't think that apps access the taxonomy for random 
ordinals or categories, the second solution might be good. Never store in the 
cache the fact that an ordinal/category is not found + don't clear() the cache, 
only nullify its reference + hope for the best :)?
                
> Add NRT support to LuceneTaxonomyReader
> ---------------------------------------
>
>                 Key: LUCENE-3441
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3441
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>         Attachments: LUCENE-3441.patch
>
>
> Currently LuceneTaxonomyReader does not support NRT - i.e., on changes to 
> LuceneTaxonomyWriter, you cannot have the reader updated, like 
> IndexReader/Writer. In order to do that we need to do the following:
> # Add ctor to LuceneTaxonomyReader to allow you to instantiate it with 
> LuceneTaxonomyWriter.
> # Add API to LuceneTaxonomyWriter to expose its internal IndexReader
> # Change LTR.refresh() to return an LTR, rather than void. This is actually 
> not strictly related to that issue, but since we'll need to modify refresh() 
> impl, I think it'll be good to change its API as well. Since all of facet API 
> is @lucene.experimental, no backwards issues here (and the sooner we do it, 
> the better).
