Re: Solr indexing performance tips

Shawn Heisey Thu, 16 Jun 2022 07:38:18 -0700

On 6/16/22 02:59, Marius Grigaitis wrote:

In the end what caught our eye is a few deleteByQuery lines in stacks of
running threads while Solr is overloaded. We temporarily removed
deleteByQuery and it had around 10x performance improvement on indexing
speed.

I do not understand all the low-level interactions. But I have seen deleteByQuery cause some major problems. It seems to create a blocking situation where Lucene waits for things to complete before it actually does the delete, and anything sent AFTER the delete waits for the delete. Imagine this situation:

1) Ongoing indexing begins a segment merge, one that will take 15 minutes to complete.
2) A deleteByQuery is sent.
3) More index changes are sent.

What happens in this situation is that step 2 will wait for the merge to complete, and step 3 will wait for step 2 to complete. I have seen automatic segment merges that take a lot longer than 15 minutes.

If step 2 is changed to query for the matching IDs and then use deleteById, then steps 2 and 3 will run concurrently with the merge.
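A minimal sketch of that query-then-deleteById replacement, in Python against Solr's HTTP API using only the standard library. The core URL, the query string, the "id" uniqueKey field name, and the rows cap are all assumptions for illustration, and the helper names are hypothetical; adjust them to your schema and client.

```python
import json
import urllib.parse
import urllib.request


def http_json(url, payload=None):
    """Send a GET (no payload) or a JSON POST and return the decoded JSON reply."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data else {}
    request = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def build_select_url(core_url, query, rows, id_field="id"):
    """Build a /select URL that returns only the uniqueKey field of matching docs."""
    params = urllib.parse.urlencode(
        {"q": query, "fl": id_field, "rows": rows, "wt": "json"}
    )
    return f"{core_url}/select?{params}"


def build_delete_by_id_payload(ids):
    """JSON update body that deletes documents by ID instead of by query."""
    return {"delete": list(ids)}


def delete_matching_by_id(core_url, query, rows=1000, id_field="id", fetch=http_json):
    """Look up the IDs that match `query`, then delete those IDs.

    `rows` caps how many IDs are fetched in this single pass (assumption:
    the match set is small; larger sets would need paging).
    """
    found = fetch(build_select_url(core_url, query, rows, id_field))
    ids = [doc[id_field] for doc in found["response"]["docs"]]
    if ids:
        fetch(f"{core_url}/update", build_delete_by_id_payload(ids))
    return ids
```

Hypothetical usage: delete_matching_by_id("http://localhost:8983/solr/mycore", "type:stale"). Unlike deleteByQuery, this only removes documents that are visible to the searcher when the query runs, so anything matching that is indexed after the lookup is left in place.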

It took a lot of head-scratching to figure out why my indexing process sometimes stalled for LONG time spans.


Thanks,
Shawn
