RE: Sequence IDs for NRT deletes

Uwe Schindler Tue, 20 Jul 2010 10:10:11 -0700

> The biggest downside of sequence IDs is increased RAM usage, right?  Ie, today
> each deletion takes 1 bit, but with sequence IDs it's 32X bigger (an int), I
> think?  Are there other downsides?


The extra space is only needed for in-RAM deletes of in-RAM segments. On disk,
and for already committed segments, the deletes are stored as bits again. That
is what Michael told me in Berlin.
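
To make that concrete, here is a minimal sketch of how an int-per-document
array for an in-RAM segment could be collapsed back to one bit per document at
commit time. This is not Lucene code: the class InRamDeletes and its methods
are hypothetical names, and the sketch assumes that a sequence ID of 0 means
"not deleted".

```java
import java.util.BitSet;

// Hypothetical sketch, not the Lucene API.
final class InRamDeletes {
    // Assumption: 0 means the doc is live; any other value is the sequence ID
    // of the delete operation that removed the doc.
    private final int[] deleteSeqIds;

    InRamDeletes(int maxDoc) {
        deleteSeqIds = new int[maxDoc];
    }

    void delete(int doc, int seqId) {
        if (seqId <= 0) {
            throw new IllegalArgumentException("seqId must be positive");
        }
        // Keep the earliest delete if the doc is deleted more than once.
        if (deleteSeqIds[doc] == 0) {
            deleteSeqIds[doc] = seqId;
        }
    }

    // At commit time the sequence IDs are no longer needed, so the deletes
    // can go back to one bit per document.
    BitSet toCommittedBits() {
        BitSet bits = new BitSet(deleteSeqIds.length);
        for (int doc = 0; doc < deleteSeqIds.length; doc++) {
            if (deleteSeqIds[doc] != 0) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // RAM used by the int array: 4 bytes per document.
    long ramBytes() {
        return 4L * deleteSeqIds.length;
    }

    // RAM or disk used by a bit-per-document representation, rounded up.
    static long bitBytes(int maxDoc) {
        return (maxDoc + 7L) / 8;
    }
}
```

Comparing ramBytes() with bitBytes(maxDoc) gives the 32X factor mentioned in
the quoted question, and only for segments that still live in RAM.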

> Then, checking if a doc is deleted becomes an int compare instead of a bit
> lookup, right?  And, we don't have to clone the deletions during reopen.
> 
> So this is an appropriate tradeoff for apps that need to reopen after every
> change to the index.  But for apps reopening less often (eg maybe up to 10X
> per second), this may not be a good tradeoff (ie they are willing to spend
> more time in the reopen if it reduces RAM footprint).  Maybe the deletes
> impl should be pluggable and apps can pick...
> 
> Mike
> 
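
The int compare and the clone-free reopen described in the quoted message can
be sketched roughly as follows. This is only an illustration under stated
assumptions, not the RT branch implementation: SeqIdDeletes and ReaderView are
hypothetical names, 0 is assumed to mean "not deleted", and the sketch is
single-threaded.

```java
// Hypothetical sketch, not the Lucene API.
final class SeqIdDeletes {
    // Assumption: 0 means live; otherwise the sequence ID of the delete.
    private final int[] deleteSeqIds;
    private int lastSeqId = 0;

    SeqIdDeletes(int maxDoc) {
        deleteSeqIds = new int[maxDoc];
    }

    // Records a delete and returns the sequence ID assigned to it.
    int delete(int doc) {
        int seqId = ++lastSeqId;
        if (deleteSeqIds[doc] == 0) {
            deleteSeqIds[doc] = seqId;
        }
        return seqId;
    }

    // "Reopen": nothing is cloned. The new reader shares the array and only
    // remembers the highest sequence ID it is allowed to see.
    ReaderView openReader() {
        return new ReaderView(this, lastSeqId);
    }

    static final class ReaderView {
        private final SeqIdDeletes deletes;
        private final int maxVisibleSeqId;

        ReaderView(SeqIdDeletes deletes, int maxVisibleSeqId) {
            this.deletes = deletes;
            this.maxVisibleSeqId = maxVisibleSeqId;
        }

        // An int compare instead of a bit lookup: deletes issued after this
        // reader was opened have a larger sequence ID and stay invisible.
        boolean isDeleted(int doc) {
            int seqId = deletes.deleteSeqIds[doc];
            return seqId != 0 && seqId <= maxVisibleSeqId;
        }
    }
}
```

A reader opened before a delete keeps seeing the document as live, while a
reader opened afterwards sees it as deleted, and both share the same array.
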
> On Tue, Jul 20, 2010 at 12:33 PM, Jason Rutherglen
> <[email protected]> wrote:
> > Michael B and I have been discussing the per segment doc writers and
> > RT patches/branch. A small improvement we can add to trunk from this
> > is the sequence IDs for deletes, which would improve the existing NRT
> > system by avoiding the cloning of bit vectors.
> > Implementing segment deleted docs via sequence IDs would additionally
> > provide a pathway for the future RT branch merge into trunk. It could
> > be best to break up the RT patches as much as possible as they touch
> > on many parts of the Lucene IndexWriter subsystem.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected] For
> > additional commands, e-mail: [email protected]
> >
> >
> 



