[
https://issues.apache.org/jira/browse/LUCENE-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shai Erera updated LUCENE-3441:
-------------------------------
Attachment: LUCENE-3441.patch
Patch introduces NRT support by doing the following:
* Add a constructor which takes DirTaxoWriter, from which DirTaxoReader obtains
the internal IndexWriter instance, to obtain NRT readers.
* Remove refresh() in exchange for a static TaxonomyReader.openIfChanged.
Similar to DirectoryReader, the method either returns null if no changes were
made to the taxonomy, or a new TR instance otherwise.
* Extracted the logic of creating the ParentArray and ChildrenArrays from
DirTaxoReader into their own classes. As a result:
** DirTaxoReader code is greatly simplified.
** These classes are now immutable, which simplified the logic of
DirTaxoReader even further.
* TaxonomyReader is now an abstract class instead of an interface, and a few
methods (e.g. close(), incRef(), decRef() etc.) were pulled up into it from
DirTaxoReader and made final.
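To illustrate the openIfChanged contract described above, here is a minimal,
self-contained sketch. It does not use the actual classes from the patch:
SketchTaxoWriter and SketchTaxoReader are hypothetical stand-ins, the
"taxonomy" is just an append-only list of category paths, and change detection
by comparing sizes is an assumption that only works because nothing is ever
removed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the taxonomy writer: an append-only list of category paths. */
class SketchTaxoWriter {
    private final List<String> categories = new ArrayList<>();

    /** Adds the category if it is missing and returns its ordinal. */
    synchronized int addCategory(String path) {
        int ordinal = categories.indexOf(path);
        if (ordinal < 0) {
            categories.add(path);
            ordinal = categories.size() - 1;
        }
        return ordinal;
    }

    synchronized int size() {
        return categories.size();
    }

    /** Returns an immutable point-in-time copy of the categories. */
    synchronized List<String> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(categories));
    }
}

/** Hypothetical stand-in for the taxonomy reader: an immutable view obtained from the writer. */
class SketchTaxoReader {
    private final SketchTaxoWriter writer;
    private final List<String> categories;
    private final AtomicInteger refCount = new AtomicInteger(1);

    /** "NRT" constructor: the reader takes its view directly from the writer. */
    SketchTaxoReader(SketchTaxoWriter writer) {
        this.writer = writer;
        this.categories = writer.snapshot();
    }

    /** Returns null if nothing changed since oldReader was opened, or a new reader otherwise. */
    static SketchTaxoReader openIfChanged(SketchTaxoReader oldReader) {
        if (oldReader.writer.size() == oldReader.getSize()) {
            return null; // append-only, so an equal size means no changes
        }
        return new SketchTaxoReader(oldReader.writer);
    }

    int getSize() {
        return categories.size();
    }

    int getOrdinal(String path) {
        return categories.indexOf(path);
    }

    // Reference counting is final so that subclasses cannot alter it.
    final void incRef() {
        refCount.incrementAndGet();
    }

    final void decRef() {
        refCount.decrementAndGet();
    }

    final int getRefCount() {
        return refCount.get();
    }
}

class OpenIfChangedDemo {
    public static void main(String[] args) {
        SketchTaxoWriter writer = new SketchTaxoWriter();
        writer.addCategory("Author");
        SketchTaxoReader reader = new SketchTaxoReader(writer);

        writer.addCategory("Author/Mark Twain");
        SketchTaxoReader newReader = SketchTaxoReader.openIfChanged(reader);
        if (newReader != null) {
            reader.decRef(); // release the old view, keep using the new one
            reader = newReader;
        }
        System.out.println("categories visible: " + reader.getSize());
    }
}
```

The calling pattern mirrors the one used with DirectoryReader: a null return
means the caller keeps its current reader, and a non-null return means the
caller swaps readers and releases the old one.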
Not strictly related, but I could not avoid making these changes as well:
* Removed the excessive verbosity in DirTaxoReader. Some of it is no longer
necessary b/c DirTaxoReader is simplified; the rest was just too much IMO.
* Improved the documentation of the different methods, again mostly by
shortening it and keeping it focused.
NOTE: I put a CHANGES entry under the back-compat section of 4.1. I intend to
commit this to 4.x, and it is sort of a back-compat break, albeit a simple
one.
There's one nocommit that I'd love someone to take a look at and perhaps
propose a solution for. I documented it there, but I'll repeat the issue here:
DirTaxoReader maintains two LRU caches which I'd like to share with the new
instance returned from openIfChanged. Currently the code copies them fully,
which is not very efficient in an NRT case.
While I could just share the instance, I'm worried that two TR instances would
then both be able to, e.g., change the cache size or add/remove entries.
Also note the weird behavior I mentioned about cloning the cache, as opposed to
adding all of its entries to a new instance. I still haven't gotten to the
bottom of why cloning the cache is so horribly slow, while adding the entries
to a fresh new instance is so cheap ...
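For readers who want to see the copy-versus-share trade-off concretely, here is
a small sketch. It is not the cache class used by DirTaxoReader (this message
does not show it); SketchLruCache is a hypothetical LRU cache built on
java.util.LinkedHashMap in access order, and copyToFresh() stands in for the
"add it all to a new instance" approach. The sketch is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache: a LinkedHashMap in access order that evicts its eldest entry. */
class SketchLruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxSize;

    SketchLruCache(int maxSize) {
        super(16, 0.75f, true); // true = iterate from least to most recently accessed
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    /** Copies all entries into a fresh instance, preserving the LRU order. */
    SketchLruCache<K, V> copyToFresh() {
        SketchLruCache<K, V> fresh = new SketchLruCache<>(maxSize);
        fresh.putAll(this);
        return fresh;
    }
}

class CacheCopyDemo {
    public static void main(String[] args) {
        SketchLruCache<String, Integer> oldCache = new SketchLruCache<>(2);
        oldCache.put("a", 1);
        oldCache.put("b", 2);
        oldCache.get("a"); // "b" is now the least recently used entry

        // Copying: the new reader gets its own cache, so its evictions
        // do not affect the old reader's cache.
        SketchLruCache<String, Integer> newCache = oldCache.copyToFresh();
        newCache.put("c", 3); // evicts "b" from newCache only

        System.out.println("old still has b: " + oldCache.containsKey("b"));
        System.out.println("new still has b: " + newCache.containsKey("b"));
    }
}
```

With a shared instance the put("c", 3) above would have evicted "b" for both
readers, which is the kind of cross-instance interference the concern above is
about. The cost of copying is one pass over all cached entries on every
reopen.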
> Add NRT support to LuceneTaxonomyReader
> ---------------------------------------
>
> Key: LUCENE-3441
> URL: https://issues.apache.org/jira/browse/LUCENE-3441
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/facet
> Reporter: Shai Erera
> Assignee: Shai Erera
> Priority: Minor
> Attachments: LUCENE-3441.patch
>
>
> Currently LuceneTaxonomyReader does not support NRT - i.e., on changes to
> LuceneTaxonomyWriter, you cannot have the reader updated, like
> IndexReader/Writer. In order to do that we need to do the following:
> # Add ctor to LuceneTaxonomyReader to allow you to instantiate it with
> LuceneTaxonomyWriter.
> # Add API to LuceneTaxonomyWriter to expose its internal IndexReader
> # Change LTR.refresh() to return an LTR, rather than void. This is actually
> not strictly related to that issue, but since we'll need to modify refresh()
> impl, I think it'll be good to change its API as well. Since all of facet API
> is @lucene.experimental, no backwards issues here (and the sooner we do it,
> the better).