Thanks for explaining that, Shawn!
Emir, I use a PHP library called Solarium to send updates and deletes to Solr. Each
request is sent to any of the available nodes in the cluster.

> On May 7, 2018, at 5:02 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> 
>> On 5/7/2018 5:05 PM, Jay Potharaju wrote:
>> There are some deletes by query. I have not had any issues with DBQ,
>> currently have 5.3 running in production.
> 
> Here's the big problem with DBQ.  Imagine this sequence of events with
> these timestamps:
> 
> 13:00:00: A commit for change visibility happens.
> 13:00:00: A segment merge is triggered by the commit.
> (It's a big merge that takes exactly 3 minutes.)
> 13:00:05: A deleteByQuery is sent.
> 13:00:15: An update to the index is sent.
> 13:00:25: An update to the index is sent.
> 13:00:35: An update to the index is sent.
> 13:00:45: An update to the index is sent.
> 13:00:55: An update to the index is sent.
> 13:01:05: An update to the index is sent.
> 13:01:15: An update to the index is sent.
> 13:01:25: An update to the index is sent.
> {time passes, more updates might be sent}
> 13:03:00: The merge finishes.
> 
> Here's what would happen in this scenario:  The DBQ and all of the
> update requests sent *after* the DBQ will block until the merge
> finishes.  That means that it's going to take up to three minutes for
> Solr to respond to those requests.  If the client that is sending the
> request is configured with a 60 second socket timeout, which inter-node
> requests made by Solr are by default, then it is going to experience a
> timeout error.  The request will probably complete successfully once the
> merge finishes, but the connection is gone, and the client has already
> received an error.
> 
> Now imagine what happens if an optimize (forced merge of the entire
> index) is requested on an index that's 50GB.  That optimize may take 2-3
> hours, possibly longer.  A deleteByQuery started on that index after the
> optimize begins (and any updates requested after the DBQ) will pause
> until the optimize is done.  A pause of 2 hours or more is a BIG problem.
> 
> This is why deleteByQuery is not recommended.
> 
> If the deleteByQuery were changed into a two-step process involving a
> query to retrieve ID values and then one or more deleteById requests,
> then none of that blocking would occur.  The deleteById operation can
> run at the same time as a segment merge, so neither it nor subsequent
> update requests will have the significant pause.  From what I
> understand, you can even do commits in this scenario and have changes be
> visible before the merge completes.  I haven't verified that this is the
> case.
> 
> Experienced devs: Can we fix this problem with DBQ?  On indexes with a
> uniqueKey, can DBQ be changed to use the two-step process I mentioned?
> 
> Thanks,
> Shawn
> 
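To make the timeline in Shawn's message concrete, here is a rough sketch that replays it and works out how long each request would sit blocked. It uses only the timestamps and the 60 second socket timeout from the quoted example, and it assumes (as a simplification) that every blocked request returns the instant the merge ends.

```python
from datetime import datetime, timedelta

# Values taken from the quoted example; nothing here talks to Solr.
SOCKET_TIMEOUT = timedelta(seconds=60)


def at(hms: str) -> datetime:
    """Parse an HH:MM:SS timestamp from the example timeline."""
    return datetime.strptime(hms, "%H:%M:%S")


MERGE_END = at("13:03:00")
DBQ_SENT = at("13:00:05")

# The DBQ, followed by eight updates sent at 10 second intervals
# (13:00:15 through 13:01:25).
requests = [("deleteByQuery", DBQ_SENT)]
requests += [("update", DBQ_SENT + timedelta(seconds=10 * i)) for i in range(1, 9)]


def blocked_for(sent: datetime) -> timedelta:
    """Assumption: a request sent at or after the DBQ waits until the merge ends."""
    return max(MERGE_END - sent, timedelta(0))


# (kind, time sent, time blocked, whether the client gives up first)
report = [
    (kind, sent.strftime("%H:%M:%S"), blocked_for(sent), blocked_for(sent) > SOCKET_TIMEOUT)
    for kind, sent in requests
]

for kind, sent, wait, timed_out in report:
    print(f"{sent}  {kind:<13}  blocked {int(wait.total_seconds()):>3}s  client timeout: {timed_out}")
```

Under these assumptions, every request in the example waits longer than the socket timeout, so the client sees an error for each of them even though Solr may still apply the change once the merge finishes.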

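The two-step alternative Shawn describes (query for the matching IDs, then send one or more deleteById requests) could look roughly like the sketch below. The names `find_ids`, `send_update`, and `two_step_delete` are hypothetical, not part of Solr or Solarium; the two callables stand in for whatever client actually runs the query and posts the update. An in-memory dictionary replaces the index so the example runs without a Solr instance. The only Solr-specific detail is the JSON update body, where a list of IDs under "delete" requests deletion by ID.

```python
import json
from typing import Callable, Iterable, Iterator, List


def batched(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield the IDs in lists of at most `size` items."""
    batch: List[str] = []
    for doc_id in ids:
        batch.append(doc_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def delete_by_id_payload(ids: List[str]) -> str:
    """Build a JSON update body that deletes the given IDs."""
    return json.dumps({"delete": ids})


def two_step_delete(
    find_ids: Callable[[str], Iterable[str]],  # hypothetical: runs the query, returns uniqueKey values
    send_update: Callable[[str], None],        # hypothetical: posts a body to the update handler
    query: str,
    batch_size: int = 500,
) -> int:
    """Step 1: collect matching IDs. Step 2: delete them by ID in batches."""
    deleted = 0
    for batch in batched(find_ids(query), batch_size):
        send_update(delete_by_id_payload(batch))
        deleted += len(batch)
    return deleted


# In-memory stand-ins for a real index and client (assumptions for the demo).
index = {"1": "red", "2": "blue", "3": "red", "4": "red"}
sent: List[str] = []


def fake_find_ids(query: str) -> List[str]:
    # Understands only the toy form "color:<value>".
    value = query.split(":", 1)[1]
    return [doc_id for doc_id, color in index.items() if color == value]


def fake_send_update(body: str) -> None:
    sent.append(body)
    for doc_id in json.loads(body)["delete"]:
        index.pop(doc_id, None)


count = two_step_delete(fake_find_ids, fake_send_update, "color:red", batch_size=2)
print(count, index, sent)
```

One caveat with this approach: it is not atomic. A document that starts matching the query after the IDs have been collected will not be deleted, so it only suits cases where that gap is acceptable.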