On Tue, Feb 16, 2010 at 10:33 AM, Alexander Klimetschek <[email protected]> wrote:
> On Mon, Feb 15, 2010 at 13:28, Marcel Reutegger
> <[email protected]> wrote:
>> On Fri, Feb 12, 2010 at 14:47, Alexander Klimetschek <[email protected]>
>> wrote:
>>> On Fri, Feb 12, 2010 at 13:33, Marcel Reutegger
>>> <[email protected]> wrote:
>>>> jackrabbit does it in a similar way for quite some time now.
>>>
>>> To me it sounds like this partial-temporary-indexing feature should be
>>> part of Lucene directly (configurable, of course).
>>
>> well, it's not that easy. jackrabbit makes use of many assumptions and
>> implementation specific properties of the content that is indexed.
>> e.g. nodes are uniquely identifiable and it is not required to
>> immediately persist the index on commit. it is sufficient that a redo
>> log contains enough information to replay the changes. all this cannot
>> be moved easily into a more generic library like lucene. however there
>> is interesting work going on with the near-real-time index that we
>> might want to use in the future.
>
> I see. The near-real-time index sounds great (however, "real-time"
> always has to be taken carefully ;-)).
I scanned http://code.google.com/p/zoie/, and although it is not totally
clear from the documentation, I assume that they do indeed have, as Marcel
points out, something similar to Jackrabbit's indexing strategy, namely a
read-only multi index reader plus one in-memory index. AFAIK, it is also
similar to [1], Lucene Ocean Real Time Search.

As the current implementation in Jackrabbit already has 'read only'
indexes, I doubt whether the gain from Lucene 2.9 will be that high. A good
paper on the changes (what is new in 2.9) can be found here, by the way [2].

What I do think we can benefit from greatly is trie ranges, as currently
range queries on, for example, dates are really expensive.

Regards
Ard

[1] http://wiki.apache.org/lucene-java/OceanRealtimeSearch
[2] http://www.lucidimagination.com/solutions/whitepapers

>
> Regards,
> Alex
>
> --
> Alexander Klimetschek
> [email protected]
>
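
To illustrate why trie ranges could help with date ranges: as far as I
understand it, a plain range query has to visit one term per distinct value
in the range, whereas a trie encoding also indexes each value at coarser
precisions, so a range can be covered by a handful of prefix terms. The
following is only a rough sketch of that idea in plain Python. It is not
Lucene's implementation or API; the function names, the (shift, prefix) term
representation and the default parameters are all made up for illustration,
and it only handles non-negative integers below 2**bits (e.g. timestamps).

```python
def trie_terms(value, precision_step=4, bits=32):
    """Terms indexed for one value: one (shift, prefix) pair per precision level.

    Hypothetical representation; assumes 0 <= value < 2**bits.
    """
    return [(shift, value >> shift) for shift in range(0, bits, precision_step)]


def split_range(lo, hi, precision_step=4, bits=32):
    """Cover the inclusive range [lo, hi] with (shift, prefix) terms.

    The ragged edges of the range are listed at fine precision; the aligned
    middle part is handed to the next coarser level.
    """
    terms = []
    shift = 0
    mask = (1 << precision_step) - 1
    while lo <= hi:
        if shift + precision_step >= bits:
            # Coarsest level: list whatever prefixes are left.
            terms.extend((shift, p) for p in range(lo, hi + 1))
            break
        if lo & mask:
            # lo does not start a block of the next level: list up to the block end.
            end = min(hi, lo | mask)
            terms.extend((shift, p) for p in range(lo, end + 1))
            lo = end + 1
        if lo <= hi and (hi & mask) != mask:
            # hi does not end a block of the next level: list from the block start.
            start = max(lo, hi & ~mask)
            terms.extend((shift, p) for p in range(start, hi + 1))
            hi = start - 1
        if lo > hi:
            break
        # What remains is made of whole blocks; continue one level coarser.
        lo >>= precision_step
        hi >>= precision_step
        shift += precision_step
    return terms


def matches(value, range_terms, precision_step=4, bits=32):
    """A value is in the range if any of its indexed terms is a range term."""
    wanted = set(range_terms)
    return any(t in wanted for t in trie_terms(value, precision_step, bits))
```

With a precision step of 2 and 8-bit values, the range 3..12 is covered by
four terms instead of ten: the edge values 3 and 12 at full precision, plus
the two coarser prefixes that stand for 4..7 and 8..11. The trade-off is that
every value is indexed under several terms, so the index grows while range
queries touch far fewer terms.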
