Hi Sujatha,
Did I understand correctly that you are deleting the same documents that will
be updated afterward? If that’s the case, you can skip the delete and just send
the updated version of each document. Solr (Lucene) does not delete in place;
it only flags a document as deleted. Updating a document (assuming the id is
the same) has the same result: the old document is no longer retrievable and
is removed from the index when the segments holding it are merged.
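
As a rough illustration of "just send the updated version", here is a minimal
Python sketch that builds a POST to Solr's JSON update handler with no delete
beforehand. The host, the collection name "mycollection", the field names and
the helper build_update_request are all assumptions made up for this example,
not taken from your setup; any routing parameters your collection needs would
still have to be added.

```python
import json
import urllib.request


def build_update_request(base_url, collection, docs, commit_within_ms=10000):
    """Build (but do not send) a POST to Solr's JSON update handler.

    Documents whose uniqueKey (assumed here to be "id") already exists in
    the index replace the old version, so no delete is sent first.
    """
    url = "{}/{}/update?commitWithin={}".format(
        base_url.rstrip("/"), collection, commit_within_ms
    )
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical batch: same ids as the documents already indexed.
    batch = [
        {"id": "doc-1", "title": "updated title 1"},
        {"id": "doc-2", "title": "updated title 2"},
    ]
    req = build_update_request("http://localhost:8983/solr", "mycollection", batch)
    print(req.get_method(), req.full_url)
    # To actually send it against a running Solr (assumption: reachable host):
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status)
```

The request is only constructed, not sent, so the sketch can be tried without
a running cluster; uncomment the last lines to send it.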

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 21 Jun 2018, at 19:59, sujatha sankaran <suja.arun2...@gmail.com> wrote:
> 
> Thanks, Shawn.
> 
> Our use case is something like this: in a batch load of several thousand
> documents, we do a delete first, followed by an update. For example, delete
> all 1000 docs and then send an update request for 1000.
> 
> What we see is that many docs go missing due to re-ordering of the DBQ
> deletes and the updates that follow them. We also saw an issue with nodes
> going down, similar to the issue described here:
> http://lucene.472066.n3.nabble.com/SolrCloud-Nodes-going-to-recovery-state-during-indexing-td4369396.html
> 
> At the end of this batch process we see many (several thousand) missing
> docs.
> 
> Due to this, and after reading the above thread, we decided to move to DBI
> and are now facing issues due to the custom routing or implicit routing which
> we have in place. So I don't think DBQ was working for us, but we did have
> several such processes (DBQ followed by updates) for different activities in
> the collection happening at the same time.
> 
> 
> Sujatha
> 
> On Thu, Jun 21, 2018 at 1:21 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> 
>> On 6/21/2018 9:59 AM, sujatha sankaran wrote:
>>> Currently from our business perspective we find that we are left with no
>>> options for deleting docs in a batch load, as:
>>>
>>> DBQ + batch do not work well together
>>> DBI + custom routing (batch load / normal) would not work as well.
>> 
>> I would expect DBQ to work, just with the caveat that if you are trying
>> to do other indexing operations at the same time, you may run into
>> significant delays, and if there are timeouts configured anywhere that
>> are shorter than those delays, requests may return failure responses or
>> log failures.
>> 
>> If you are using DBQ, you just need to be sure that there are no other
>> operations happening at the same time, or that your error handling is
>> bulletproof.  Making sure that no other operations are happening at the
>> same time as the DBQ is in my opinion a better option.
>> 
>> Thanks,
>> Shawn
>> 
>> 
