But to me, it always seems...er...fraught to even *think* about relying on doc IDs. I know you've been around the block with Lucene, but do you have a compelling reason to use the doc ID rather than your own unique ID?
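To make the worry concrete, here is a toy sketch in plain Java. It is not Lucene code and does not claim to mirror how Lucene merges segments. It assumes, purely for illustration, that a merge could visit segments in a different order (Daniel's HashSet hypothesis, unconfirmed here). The names `ToyIndex`, `positionOf` and `mergeInReverseSegmentOrder` are all made up. The position in the flattened list stands in for a doc ID, and the `doc-N` string stands in for an application-assigned unique ID.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (hypothetical, not Lucene): an index is a list of segments,
// and each segment is a list of application-assigned keys.
class ToyIndex {
    private final List<List<String>> segments = new ArrayList<>();
    private long nextKey = 0;

    // Adds a document as its own single-document segment and
    // returns the application key assigned to it.
    String add() {
        String key = "doc-" + (nextKey++);
        List<String> segment = new ArrayList<>();
        segment.add(key);
        segments.add(segment);
        return key;
    }

    // Position of the key in the flattened segment list: the analogue
    // of a doc ID in this toy model. Returns -1 if the key is absent.
    int positionOf(String key) {
        int position = 0;
        for (List<String> segment : segments) {
            for (String k : segment) {
                if (k.equals(key)) {
                    return position;
                }
                position++;
            }
        }
        return -1;
    }

    // Assumption for illustration: a merge that happens to visit the
    // segments in reverse order and writes them into one new segment.
    void mergeInReverseSegmentOrder() {
        List<String> merged = new ArrayList<>();
        for (int i = segments.size() - 1; i >= 0; i--) {
            merged.addAll(segments.get(i));
        }
        segments.clear();
        segments.add(merged);
    }
}

class ToyIndexDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        String first = index.add();
        index.add();
        index.add();

        System.out.println("key=" + first + " position before merge=" + index.positionOf(first));
        index.mergeInReverseSegmentOrder();
        System.out.println("key=" + first + " position after merge=" + index.positionOf(first));
    }
}
```

In this model the key stays the same across the merge while the position does not, which is the whole argument for carrying your own ID. In a real index the key would presumably live in its own field and be looked up by term; that part is not shown here.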
Best
Erick

On Tue, Mar 11, 2008 at 5:39 PM, Daniel Noll <[EMAIL PROTECTED]> wrote:

> On Tuesday 11 March 2008 19:55:39 Michael McCandless wrote:
> > Hi Daniel,
> >
> > 2.3 should be no different from 2.2 in that docIDs only "shift" when
> > a merge of segments with deletions completes.
> >
> > Could it be the ConcurrentMergeScheduler? Merges now run in the
> > background by default and commit whenever they complete. You can get
> > back to the previous (blocking) behavior by using
> > SerialMergeScheduler instead.
>
> That was my first thought, but SerialMergeScheduler doesn't cause the
> problem. I've done a little more investigation since; it turns out that
> if I don't call optimize() then the problem doesn't occur.
>
> Could it be that optimize(int,boolean) is storing the segments to
> optimise in a HashSet, which by its nature reorders the segments?
>
> > If it's not that ... can you provide more details about how your
> > applications is relying on docIDs?
>
> As far as that, we assume that if there are N documents in the index
> then the next document ID will be N (we determine this before adding
> the document.) As we're only doing this in a single thread and we never
> delete documents, this was previously safe.
>
> Daniel