Hi Emir,

We are deleting a larger subset of docs that share a particular value, which we can identify from the id, and then updating only a few of the deleted docs. Our document ids are of the form <type>_<part1>_<part2>. We need to delete all docs with the same <part1> that are no longer in the DB, and then update only the few that have been updated in the DB.
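A minimal sketch of the id bookkeeping described above, in case it helps make the case concrete. It only works out which ids would be deleted and which would be re-sent for one <part1> group; it does not talk to Solr. The function names, the sample ids, and the assumption that <type> and <part1> contain no underscores are all hypothetical, not taken from our actual setup.

```python
def part1_of(doc_id: str) -> str:
    """Return <part1> from an id of the form <type>_<part1>_<part2>."""
    # Assumption: <type> and <part1> contain no underscores; <part2> may.
    _type, part1, _part2 = doc_id.split("_", 2)
    return part1


def plan_batch(indexed_ids, db_ids, changed_ids, part1):
    """Split one <part1> group into ids to delete and ids to re-send.

    indexed_ids: ids currently in the index (hypothetical input)
    db_ids:      ids currently in the DB
    changed_ids: ids whose DB rows changed since the last load
    """
    in_index = {i for i in indexed_ids if part1_of(i) == part1}
    in_db = {i for i in db_ids if part1_of(i) == part1}
    # Only docs that vanished from the DB need an explicit delete.
    to_delete = sorted(in_index - in_db)
    # Docs that still exist and changed are simply re-sent with the same id.
    to_update = sorted(set(changed_ids) & in_db)
    return to_delete, to_update


if __name__ == "__main__":
    indexed = ["doc_42_a", "doc_42_b", "doc_42_c", "doc_7_a"]
    db = ["doc_42_a", "doc_42_c", "doc_42_d", "doc_7_a"]
    changed = ["doc_42_c", "doc_42_d"]
    print(plan_batch(indexed, db, changed, "42"))
```

With this split, the ids in the delete list and the ids in the update list never overlap, so the order in which the two requests arrive would not matter for any single document.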

Thanks,
Sujatha

On Sun, Jun 24, 2018 at 8:59 AM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:

> Hi Sujatha,
> Did I get it right that you are deleting the same documents that will be
> updated afterward? If that’s the case, then you can simply skip deleting,
> and just send updated version of document. Solr (Lucene) does not have
> delete - it’ll just flag document as deleted. Updating document (assuming
> id is the same) will result in the same thing - old document will not be
> retrievable and will be removed from index when segments holding it is
> merged.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
> > On 21 Jun 2018, at 19:59, sujatha sankaran <suja.arun2...@gmail.com> wrote:
> >
> > Thanks,Shawn.
> >
> > Our use case is something like this in a batch load of several 1000's of
> > documents,we do a delete first followed by update.Example delete all 1000
> > docs and send an update request for 1000.
> >
> > What we see is that there are many missing docs due to DBQ re-ordering of
> > the order of deletes followed by updates.We also saw issue with nodes
> > going down similar tot issue described here:
> > http://lucene.472066.n3.nabble.com/SolrCloud-Nodes-going-to-recovery-state-during-indexing-td4369396.html
> >
> > we see at the end of this batch process, many (several thousand ) missing
> > docs.
> >
> > Due to this and after reading above thread , we decided to move to DBI and
> > now are facing issues due to custom routing or implicit routing which we
> > have in place.So I don't think DBQ was working for us, but we did have
> > several such process ( DBQ followed by updates) for different activities in
> > the collection happening at the same time.
> >
> > Sujatha
> >
> > On Thu, Jun 21, 2018 at 1:21 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> >
> >> On 6/21/2018 9:59 AM, sujatha sankaran wrote:
> >>> Currently from our business perspective we find that we are left with no
> >>> options for deleting docs in a batch load as :
> >>>
> >>> DBQ+ batch does not work well together
> >>> DBI+ custom routing (batch load / normal) would not work as well.
> >>
> >> I would expect DBQ to work, just with the caveat that if you are trying
> >> to do other indexing operations at the same time, you may run into
> >> significant delays, and if there are timeouts configured anywhere that
> >> are shorter than those delays, requests may return failure responses or
> >> log failures.
> >>
> >> If you are using DBQ, you just need to be sure that there are no other
> >> operations happening at the same time, or that your error handling is
> >> bulletproof. Making sure that no other operations are happening at the
> >> same time as the DBQ is in my opinion a better option.
> >>
> >> Thanks,
> >> Shawn