> Reading that trail, I wish the original poster gave up on his idea (

Err, that should have read:
"Reading that trail, I wish the original poster hadn't given up on his idea"

On Thu, Jan 14, 2010 at 2:23 AM, Babak Farhang <farh...@gmail.com> wrote:
> Hi,
>
> I've been thinking about how to update a single field of a document
> without touching its other fields. This is an old problem and I was
> considering a solution along the lines of Andrzej Bialecki's post to
> the dev list back in '07:
>
>
> <quote http://markmail.org/message/tbkgmnilhvrt6bii >
>
> I have the following scenario: I want to use ParallelReader to
> maintain parts of the index that are changing quickly, and where
> changes are limited to specific fields only.
>
> Let's say I have a "main" index (many fields, slowly changing, large
> updates), and an "aux" index (fast changing, usually single doc and
> single field updates). I'd like to "replace" documents in the "aux"
> index - that is, delete one doc and add another - but in a way that
> doesn't change the internal document numbers, so that I can keep the
> mapping required by ParallelReader intact.
>
> I think this is possible to achieve by using a FilterIndexReader,
> which keeps a map of updated documents, and re-maps old doc ids to the
> new ones on the fly.
>
> From time to time I'd like to optimize the "aux" index to get rid of
> deleted docs. At this time I need to figure out how to preserve the
> old->new mapping during the optimization.
>
> So, here's the question: is this scenario feasible? If so, then in the
> trunk/ version of Lucene, is there any way to figure out (predictably)
> how internal document numbers are reassigned after calling optimize()
> ?
>
> </quote>
>
>
> Reading that trail, I wish the original poster gave up on his idea (
> http://markmail.org/message/tbkgmnilhvrt6bii#query:+page:1+mid:kn77zpiu43kd2ufn+state:results
> )
>
>
> <quote>
> Thanks for the input - for now I gave up on this, after discovering
> that I would have no way to ensure in TermDocs.skipTo() that document
> id-s are monotonically increasing (which seems to be a part of the
> contract).
> </quote>
>
> I imagine if Andrzej's proposed FilterIndexReader maintains 2 sorted
> (ordered) maps, one from internal document-ids to "view" document-ids,
> and another mapping from "view" document-ids to internal
> document-ids, then things like skipTo() can be implemented reasonably
> efficiently. Only the mapped ids are maintained in these structures.
> (Also note that a mapped "view" document-id represents an internally
> deleted document with that id.)
>
> And if we can find a way to merge the segments of this "aux" index
> along whenever the segments of its associated "main" index are merged
> or optimized (such that the [internal] doc-ids in the merged aux index
> end up getting sync'ed with those of the trunk), then there shouldn't
> be all that many doc-ids to map anyway (if we merge frequently
> enough).
>
> So to go back Andrzej's question: is there any way to figure out
> (predictably) how internal document numbers [in the main index] are
> reassigned after calling optimize() ? How does LUCENE-847, as Doug
> Cutting suggests in that trail, help?
>
> Sorry if that was long winded, had to start somewhere ;)
>
> -Babak
>
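
To make the two-map idea above a bit more concrete, here is a minimal sketch in plain Java. It is only an illustration under assumptions, not Lucene code: DocIdView, replace() and this skipTo() signature are made-up names, the sorted int[] stands in for a term's postings, and the array is assumed to hold only live (non-deleted) internal ids.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not a Lucene class): remaps "view" doc ids to internal ones and back. */
final class DocIdView {
    // view id -> internal id of the replacement document (only mapped ids are stored)
    private final TreeMap<Integer, Integer> viewToInternal = new TreeMap<>();
    // internal id of the replacement document -> view id
    private final TreeMap<Integer, Integer> internalToView = new TreeMap<>();

    /** Records that the document seen as viewId now lives at internalId. */
    void replace(int viewId, int internalId) {
        Integer previous = viewToInternal.put(viewId, internalId);
        if (previous != null) {
            internalToView.remove(previous); // the older replacement is now deleted too
        }
        internalToView.put(internalId, viewId);
    }

    /** Unmapped ids are identical in both numberings. */
    int toInternal(int viewId) {
        return viewToInternal.getOrDefault(viewId, viewId);
    }

    int toView(int internalId) {
        return internalToView.getOrDefault(internalId, internalId);
    }

    /**
     * Returns the smallest view id >= target among the matching documents, or -1 if none.
     * liveInternalMatches must be sorted and contain only live internal ids.
     */
    int skipTo(int[] liveInternalMatches, int target) {
        int best = -1;
        // Candidate 1: first unmapped match at or after target (view id == internal id).
        int pos = Arrays.binarySearch(liveInternalMatches, target);
        if (pos < 0) {
            pos = -pos - 1;
        }
        for (int i = pos; i < liveInternalMatches.length; i++) {
            if (!internalToView.containsKey(liveInternalMatches[i])) {
                best = liveInternalMatches[i];
                break;
            }
        }
        // Candidate 2: first mapped view id at or after target whose replacement doc matches.
        for (Map.Entry<Integer, Integer> e : viewToInternal.tailMap(target, true).entrySet()) {
            if (best != -1 && e.getKey() >= best) {
                break; // cannot beat candidate 1 any more
            }
            if (Arrays.binarySearch(liveInternalMatches, e.getValue()) >= 0) {
                best = e.getKey();
                break;
            }
        }
        return best;
    }
}
```

For instance, if doc 2 is replaced by a new doc appended at internal id 6 and a term matches internal docs {1, 4, 6}, the view order comes out as 1, 2, 4, so ids stay monotonically increasing from the caller's side. The cost of a skip grows with the number of mapped ids past the target, which is why frequent merging (keeping the maps small) matters.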