This has been hanging around for a long time. I did some preliminary work here: https://issues.apache.org/jira/browse/SOLR-445 but moved on to other things before committing it. The discussion there might be useful.
FWIW,
Erick

On Wed, Feb 27, 2013 at 5:32 AM, Mikhail Khludnev <mkhlud...@griddynamics.com> wrote:
> Colleagues,
>
> Here are my considerations.
>
> If the exception occurs somewhere in an update processor, we can add a
> special update processor at the head of the update processor chain,
> which will catch the exception from the delegated processAdd call, then
> log and/or swallow it.
> If it fits the purpose, we can try to figure out how to return the failed
> doc ids back to the client. I'm not sure, but I think it's possible, just
> because the response writer is quite -dumb- flexible, i.e. if an update
> processor drops something into the response, it should be blindly streamed
> back to the client.
>
> One more consideration.
> Anirudha,
> When you say "re-try them", do you mean to post a failed doc one more time?
> It seems I didn't get your point. Please clarify.
>
> On 27.02.2013 at 1:13, Anirudha Jadhav <aniru...@nyu.edu> wrote:
>
> > Ideally you would want to use SolrJ or another interface which can catch
> > exceptions/errors and re-try them.
> >
> > On Tue, Feb 26, 2013 at 3:45 PM, Walter Underwood <wun...@wunderwood.org> wrote:
> >
> > > I've done exactly the same thing. On error, set the batch size to one
> > > and try again.
> > >
> > > wunder
> > >
> > > On Feb 26, 2013, at 12:27 PM, Timothy Potter wrote:
> > >
> > > > Here's what I do to work around failures when processing batches of
> > > > updates:
> > > >
> > > > On the client side, catch the exception that the batch failed. In the
> > > > exception handler, switch to one-by-one mode for the failed batch
> > > > only.
> > > >
> > > > This allows you to isolate the *bad* documents as well as get the
> > > > *good* documents in the batch indexed in Solr.
> > > >
> > > > This assumes most batches work, so you only pay the one-by-one penalty
> > > > for the occasional batch with a bad doc.
> > > > Tim
> > > >
> > > > On Tue, Feb 26, 2013 at 12:08 PM, Isaac Hebsh <isaac.he...@gmail.com> wrote:
> > > >> Hi.
> > > >>
> > > >> I add documents to Solr by POSTing them to the UpdateHandler, as bulks
> > > >> of <add> commands (DIH is not used).
> > > >>
> > > >> If one document contains any invalid data (e.g. string data in a
> > > >> numeric field), Solr returns HTTP 400 Bad Request, and the whole bulk
> > > >> fails.
> > > >>
> > > >> I'm searching for a way to tell Solr to accept the rest of the
> > > >> documents... (I'll use RealTimeGet to determine which documents were
> > > >> added.)
> > > >>
> > > >> If there is no standard way of doing it, maybe it can be implemented
> > > >> by splitting the <add> commands into separate HTTP POSTs. Because of
> > > >> using auto-soft-commit, can I say that it is almost equivalent? What
> > > >> is the performance penalty of 100 POST requests (of 1 document each)
> > > >> against 1 request of 100 docs, if a soft commit is eventually done?
> > > >>
> > > >> Thanks in advance...
> > >
> > > --
> > > Walter Underwood
> > > wun...@wunderwood.org
>
> --
> Anirudha P. Jadhav
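The one-by-one fallback that Tim and Walter describe can be sketched independently of any particular Solr client. In this minimal sketch, `post_batch` is a hypothetical stand-in for whatever actually sends documents to Solr (an HTTP POST of an <add> bulk, or a SolrJ add call) and is assumed to raise an exception when Solr rejects the request:

```python
def index_with_fallback(docs, post_batch):
    """Try the whole batch as one request; on error, fall back to
    posting one document at a time.

    Returns (indexed, rejected): the docs that were accepted and the
    docs that Solr rejected even on their own.
    """
    try:
        post_batch(docs)          # happy path: one request for the batch
        return list(docs), []
    except Exception:
        indexed, rejected = [], []
        for doc in docs:          # pay the one-by-one penalty only here
            try:
                post_batch([doc])
                indexed.append(doc)
            except Exception:
                rejected.append(doc)  # isolated: only this doc is bad
        return indexed, rejected
```

This keeps the per-document cost confined to the occasional batch containing a bad document, which is the assumption the thread relies on.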