On 6/16/22 02:59, Marius Grigaitis wrote:
> In the end what caught our eye is a few deleteByQuery lines in stacks of
> running threads while Solr is overloaded. We temporarily removed
> deleteByQuery and it had around 10x performance improvement on indexing
> speed.

I do not understand all the low-level interactions, but I have seen deleteByQuery cause some major problems.  It seems to create a blocking situation where Lucene waits for in-progress work such as a segment merge to complete before it actually does the delete, and anything sent AFTER the delete waits for the delete.  Imagine this situation:

1) Ongoing indexing begins a segment merge, one that will take 15 minutes to complete.
2) A deleteByQuery is sent.
3) More index changes are sent.

What happens in this situation is that step 2 will wait for the merge to complete, and step 3 will wait for step 2 to complete.  I have seen automatic segment merges that take a lot longer than 15 minutes.

If step 2 is changed to query for the IDs of the matching documents and then delete them with deleteById, then steps 2 and 3 will run concurrently with the merge.
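
Here is a rough sketch of that query-then-deleteById approach over Solr's HTTP API, using only the Python standard library.  It is an illustration and not tested against a live cluster.  The base URL passed in, the uniqueKey field name "id", and the helper function names are all assumptions.  It pages through the matches with cursorMark and posts each page of IDs to /update as a JSON delete-by-ID request.

```python
import json
import urllib.parse
import urllib.request


def build_select_url(base_url, query, cursor_mark="*", rows=1000, id_field="id"):
    """Build a /select URL that returns only the unique key of matching docs."""
    params = {
        "q": query,
        "fl": id_field,             # fetch only the ID field
        "rows": rows,
        "sort": f"{id_field} asc",  # cursorMark requires a sort on the uniqueKey
        "cursorMark": cursor_mark,
        "wt": "json",
    }
    return f"{base_url}/select?{urllib.parse.urlencode(params)}"


def build_delete_by_id_body(ids):
    """JSON update body that deletes by ID: {"delete": ["id1", "id2", ...]}."""
    return json.dumps({"delete": list(ids)}).encode("utf-8")


def fetch_json(url, body=None):
    """GET the URL, or POST when a JSON body is given, and parse the reply."""
    headers = {"Content-Type": "application/json"} if body is not None else {}
    request = urllib.request.Request(url, data=body, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def delete_matching_by_id(base_url, query, id_field="id"):
    """Look up the IDs matching the query, then delete them by ID, page by page."""
    cursor = "*"
    deleted = 0
    while True:
        page = fetch_json(build_select_url(base_url, query, cursor, id_field=id_field))
        ids = [doc[id_field] for doc in page["response"]["docs"]]
        if ids:
            fetch_json(f"{base_url}/update", build_delete_by_id_body(ids))
            deleted += len(ids)
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # cursor stopped advancing: no more results
            break
        cursor = next_cursor
    return deleted
```

Calling it would look something like delete_matching_by_id("http://localhost:8983/solr/mycollection", "type:obsolete"), where both arguments are made up.  The sketch does not send a commit, so it relies on whatever autoCommit settings are configured.  One trade-off to keep in mind: documents that match the query but are indexed after the ID lookup will not be deleted, which is different from what deleteByQuery would do.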

It took a lot of head-scratching to figure out why my indexing process sometimes stalled for LONG time spans.

Thanks,
Shawn
