Re: Long blocking during indexing + deleteByQuery

Michael McCandless Wed, 08 Nov 2017 03:24:17 -0800

I'm not sure this is what's affecting you, but you might try upgrading to
Lucene/Solr 7.1; in 7.0 there were big improvements in using multiple
threads to resolve deletions:
http://blog.mikemccandless.com/2017/07/lucene-gets-concurrent-deletes-and.html


Mike McCandless

http://blog.mikemccandless.com

On Tue, Nov 7, 2017 at 2:26 PM, Chris Troullis <cptroul...@gmail.com> wrote:

> @Erick, I see, thanks for the clarification.
>
> @Shawn, Good idea for the workaround! I will try that and see if it
> resolves the issue.
>
> Thanks,
>
> Chris
>
> On Tue, Nov 7, 2017 at 1:09 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
> > bq: you think it is caused by the DBQ deleting a document while a
> > document with that same ID
> >
> > No. I'm saying that DBQ has no idea _if_ that would be the case so
> > can't carry out the operations in parallel because it _might_ be the
> > case.
> >
> > Shawn:
> >
> > IIUC, here's the problem. For deleteById, I can guarantee the
> > sequencing through the same optimistic locking that regular updates
> > use (i.e. the _version_ field). But I'm kind of guessing here.
> >
> > Best,
> > Erick
> >
> > On Tue, Nov 7, 2017 at 8:51 AM, Shawn Heisey <apa...@elyograg.org>
> > wrote:
> > > On 11/5/2017 12:20 PM, Chris Troullis wrote:
> > >> The issue I am seeing is when some
> > >> threads are adding/updating documents while other threads are issuing
> > >> deletes (using deleteByQuery), solr seems to get into a state of
> > >> extreme blocking on the replica
> > >
> > > The deleteByQuery operation cannot coexist very well with other indexing
> > > operations.  Let me tell you about something I discovered.  I think your
> > > problem is very similar.
> > >
> > > Solr 4.0 and later is supposed to be able to handle indexing operations
> > > at the same time that the index is being optimized (in Lucene,
> > > forceMerge).  I have some indexes that take about two hours to optimize,
> > > so having indexing stop while that happens is a less than ideal
> > > situation.  Ongoing indexing is similar in many ways to a merge, enough
> > > that it is handled by the same Merge Scheduler that handles an optimize.
> > >
> > > I could indeed add documents to the index without issues at the same
> > > time as an optimize, but when I would try my full indexing cycle while
> > > an optimize was underway, I found that all operations stopped until the
> > > optimize finished.
> > >
> > > Ultimately what was determined (I think it was Yonik that figured it
> > > out) was that *most* indexing operations can happen during the optimize,
> > > *except* for deleteByQuery.  The deleteById operation works just fine.
> > >
> > > I do not understand the low-level reasons for this, but apparently it's
> > > not something that can be easily fixed.
> > >
> > > A workaround is to send the query you plan to use with deleteByQuery as
> > > a standard query with a limited fl parameter, to retrieve matching
> > > uniqueKey values from the index, then do a deleteById with that list of
> > > ID values instead.
> > >
> > > Thanks,
> > > Shawn
> > >
> >
>
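
For reference, the workaround Shawn describes in the quoted message (run the
delete query as a normal search with fl limited to the uniqueKey, then delete
the returned IDs) might look roughly like the sketch below. This is only an
illustration, not tested against a live Solr: it assumes the uniqueKey field
is named "id", that the core is reachable at a base URL such as
http://localhost:8983/solr/mycore (hypothetical), and that the standard
/select and /update JSON handlers are enabled. The function names are made up
for this example.

```python
import json
import urllib.parse
import urllib.request


def build_id_query_url(base_url, query, id_field="id", rows=1000):
    """Build a standard /select URL that returns only the uniqueKey field."""
    params = urllib.parse.urlencode(
        {"q": query, "fl": id_field, "rows": rows, "wt": "json"}
    )
    return f"{base_url}/select?{params}"


def build_delete_by_id_payload(ids):
    """Build a JSON update body that deletes by ID instead of by query."""
    return {"delete": list(ids)}


def http_get_json(url):
    """Default transport: GET a URL and parse the JSON response."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def http_post_json(url, payload):
    """Default transport: POST a JSON body and parse the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def delete_matching_by_id(base_url, query, id_field="id", rows=1000,
                          get_json=http_get_json, post_json=http_post_json):
    """Fetch up to `rows` matching IDs, delete them by ID, return the IDs.

    Assumption: `id_field` is the collection's uniqueKey. No commit is
    issued here; that is left to autoCommit or the caller.
    """
    result = get_json(build_id_query_url(base_url, query, id_field, rows))
    ids = [doc[id_field] for doc in result["response"]["docs"]]
    if ids:  # skip the update request entirely when nothing matched
        post_json(f"{base_url}/update", build_delete_by_id_payload(ids))
    return ids
```

Note that this is not equivalent to deleteByQuery: documents that start
matching the query after the search runs are not deleted, and only the first
`rows` matches are handled per call, so a large result set would need paging
or repeated calls after a commit.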
