Hi Karl,
I have seen this and have always thought I should spend some time on
it, but then didn't get to it. That isn't to say it isn't useful. One
thing I wonder about is whether it could be a standalone contrib
package, or whether there is a way to separate out the interface
changes from the InstantiatedIndex stuff. That way you could lobby for
InstIndex as a contrib, and then submit a separate patch for the API
changes. And please feel free to tell me that can't be done; I am
just wondering out loud here, trying to find a path to take so it
isn't lost.
I think there are some reasons Document is final, although I am not
sure they can't be handled as a buyer-beware issue. If you
search the archives for Document and final I think you will see the
arguments. There is also an issue in JIRA related to it
(https://issues.apache.org/jira/browse/LUCENE-778), so you are not the
only one asking for it (I see you commented on that one).
By the looks of the issue, you had a lot of comments and good input.
Do you feel the issues have all been addressed? Just asking...
Also, do Mike M's changes affect how you would do these things?
Mostly this is just me trying to figure out this patch. I, too, would
hate to see it wither, but I can't make any promises on time, either. By
the way, the Flexible Indexing stuff from Nicolas et al. is in this
same boat in my mind. Would love to have 'em in Lucene, but don't
have the cycles to do it. Sigh.
-Grant
On Jul 26, 2007, at 3:56 PM, karl wettin wrote:
Some time ago I tried to introduce LUCENE-581, a new consumer top
layer, the core changes required by LUCENE-550, my
InstantiatedIndex. I would still like to see this become a part of the
core. It is completely backwards compatible but contains a few
small changes that seem to be controversial, and I'm honestly not
sure why:
* Complete definalization of Term, Document and IndexReader.
* IndexWriterInterface
In my eyes, the only thing these restrictions do is limit Lucene
development to the file-centric Directory store. There is nothing
wrong with Directory, I just want to be able to use the same code
for any store design of my choice. I want uniform index handling,
no matter the implementation. One line of code that switches between
Directory, BDB, MemoryIndex, InstantiatedIndex or what not.
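
A minimal sketch of that "one line" switch, not taken from the patches.
Only the name IndexWriterInterface comes from the list above; its two
methods and the classes DirectoryBackedWriter, InstantiatedWriter and
Indexer are hypothetical stand-ins, not Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical method set; the real interface in the patch may differ.
interface IndexWriterInterface {
    void addDocument(String text);
    int docCount();
}

// Stand-in for a file-centric, Directory-backed writer.
class DirectoryBackedWriter implements IndexWriterInterface {
    private final List<String> segments = new ArrayList<>();
    public void addDocument(String text) { segments.add(text); }
    public int docCount() { return segments.size(); }
}

// Stand-in for a writer over an in-memory object graph.
class InstantiatedWriter implements IndexWriterInterface {
    private final List<String> documents = new ArrayList<>();
    public void addDocument(String text) { documents.add(text); }
    public int docCount() { return documents.size(); }
}

class Indexer {
    // Application code depends only on the interface, never on the store.
    static int indexAll(IndexWriterInterface writer, List<String> docs) {
        for (String doc : docs) {
            writer.addDocument(doc);
        }
        return writer.docCount();
    }
}
```

With this shape, the choice of store is confined to the single line that
constructs the writer, e.g. `IndexWriterInterface writer = new
InstantiatedWriter();`, and everything downstream stays unchanged.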
This post is about InstantiatedIndex and the things I built upon
it. As time passed I just gave up on keeping them up to date. The code
is in use at one place where it just keeps running with no
need to update, stuck on Lucene 2.0 or so. We are now getting close
to Lucene 3.0 and I would hate to see this code get lost in time.
It has so many neat features. Being really, really fast on small
corpora is just one.
In essence the design is similar to contrib/MemoryIndex, but it can
hold multiple documents.
The definalization and interface also allows for index insert/
delete/optimization notifications.
These two features combined yielded an active cache (not really
used in any project, just a proof-of-concept I experimented with on
a site where a lot of users place the exact same query) that updates
cached results only when they are affected by new data. This could be
done with MemoryIndex too, but not as fast as InstantiatedIndex can
handle batches of documents.
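
The active-cache idea can be sketched roughly as follows. This is an
illustration under stated assumptions, not the proof-of-concept code:
IndexChangeListener, NotifyingIndex and ActiveCache are hypothetical
names, documents are reduced to sets of terms, and a "query" is a single
term whose result is a hit count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical insert notification; not part of Lucene's API.
interface IndexChangeListener {
    void documentAdded(Set<String> terms);
}

// Stand-in for an index that announces inserts to its listeners.
class NotifyingIndex {
    private final List<IndexChangeListener> listeners = new ArrayList<>();
    private final List<Set<String>> documents = new ArrayList<>();

    void addListener(IndexChangeListener listener) { listeners.add(listener); }

    void addDocument(Set<String> terms) {
        documents.add(terms);
        for (IndexChangeListener listener : listeners) {
            listener.documentAdded(terms);
        }
    }

    // Counts documents containing the term; stands in for a real search.
    int search(String term) {
        int hits = 0;
        for (Set<String> doc : documents) {
            if (doc.contains(term)) hits++;
        }
        return hits;
    }
}

class ActiveCache implements IndexChangeListener {
    private final NotifyingIndex index;
    private final Map<String, Integer> cached = new HashMap<>();
    int recomputations = 0; // exposed only to make the behaviour observable

    ActiveCache(NotifyingIndex index) {
        this.index = index;
        index.addListener(this);
    }

    int hits(String term) {
        Integer count = cached.get(term);
        if (count == null) {
            count = index.search(term);
            cached.put(term, count);
            recomputations++;
        }
        return count;
    }

    @Override
    public void documentAdded(Set<String> terms) {
        // Refresh only the cached queries the new document can affect.
        for (String term : terms) {
            if (cached.containsKey(term)) {
                cached.put(term, index.search(term));
                recomputations++;
            }
        }
    }
}
```

A new document that shares no term with any cached query leaves the
cache untouched; only entries whose term appears in the new document are
re-evaluated.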
One can, however, do a lot of other things with it.
In LUCENE-626 I also use InstantiatedIndex, getting some 10-20
times faster response times from my contrib/spellcheck augmentation
than when using a RAMDirectory.
There are more features and potentially cool things one might want
to consider in the 550-patch/UML diagram.
Would the core changes that InstantiatedIndex requires ever be
committed? Then I could sit down and bring these patches up to
date. Otherwise I'll just let them become some deprecated artifact
I use for a couple of things such as spellchecking, rather than a
neat augmentation of Lucene I could use for any future development.
--
karl
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org/tech/lucene.asp
Read the Lucene Java FAQ at http://wiki.apache.org/lucene-java/LuceneFAQ