Steven's solution is a very common one, complete with the notion of
re-chunking. Depending on the throughput requirements, simply resending
the docs from the offending packet one at a time is often sufficient
(but not _efficient_). I can imagine fallback scenarios like "try
chunking 100 at a time; for the chunks that fail, do 10 at a time, and
for those, 1 at a time".
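A completely untested sketch of that fallback, just to show the shape of
it (it assumes a SolrJ SolrClient and a List<SolrInputDocument>; the
method name, the "id" field, and catching bare Exception are my
inventions -- depending on the client, a failed add can surface as
SolrServerException, IOException, or a runtime RemoteSolrException):

    import java.util.List;
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class FallbackIndexer {

        // Try chunks of the given size; when a chunk fails, shrink by
        // 10x and retry just that chunk, until a 1-doc "chunk" isolates
        // the offending document.
        static void addWithFallback(SolrClient client,
                                    List<SolrInputDocument> docs,
                                    int chunkSize) {
            for (int i = 0; i < docs.size(); i += chunkSize) {
                List<SolrInputDocument> chunk =
                        docs.subList(i, Math.min(i + chunkSize, docs.size()));
                try {
                    client.add(chunk);
                } catch (Exception e) {
                    if (chunkSize == 1) {
                        // Down to a single doc: this is an offender.
                        // Report it rather than retrying forever.
                        System.err.println("doc failed: "
                                + chunk.get(0).getFieldValue("id") + ": " + e);
                    } else {
                        addWithFallback(client, chunk,
                                Math.max(1, chunkSize / 10));
                    }
                }
            }
        }
    }

One caveat: this resends every doc in a failed chunk, so it assumes your
adds are safe to repeat. With the doc-version constraints from the
original question below, docs that already went in will be rejected
again on the resend and show up as false positives.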
That said, in a lot of situations the number of failures is low enough
that just falling back to one at a time, while not elegant, is
sufficient (a bare-bones version of that replay loop is sketched below
the quoted thread). It sure will be nice to have SOLR-445 done, if we
can just keep Hoss from going crazy before he gets done.

Best,
Erick

On Thu, Feb 11, 2016 at 7:39 AM, Steven White <[email protected]> wrote:
> For my application, the solution I implemented is to log the chunk that
> failed into a file. This file is then post-processed one record at a
> time. The ones that fail are reported to the admin and never looked at
> again until the admin takes action. This is not the most efficient
> solution right now, but I intend to refactor this code so that the
> failed chunk is itself re-processed in smaller chunks until the chunk
> with the failed record(s) is down to a 1-record "chunk" that will fail.
>
> Like Debraj, I would love to hear from others how they handle such
> failures.
>
> Steve
>
> On Thu, Feb 11, 2016 at 2:29 AM, Debraj Manna <[email protected]>
> wrote:
>
>> Thanks Erick. How do people handle this scenario? Right now the only
>> option I can think of is to replay the entire batch by doing an add
>> for every single doc. Then this will give me errors for all the docs
>> which already got added from the batch.
>>
>> On Tue, Feb 9, 2016 at 10:57 PM, Erick Erickson <[email protected]>
>> wrote:
>>
>> > This has been a long-standing issue. Hoss is doing some current
>> > work on it, see:
>> > https://issues.apache.org/jira/browse/SOLR-445
>> >
>> > But the short form is "no, not yet".
>> >
>> > Best,
>> > Erick
>> >
>> > On Tue, Feb 9, 2016 at 8:19 AM, Debraj Manna <[email protected]>
>> > wrote:
>> > > Hi,
>> > >
>> > > I have Document Centric Versioning Constraints added in my Solr
>> > > schema:
>> > >
>> > >     <processor class="solr.DocBasedVersionConstraintsProcessorFactory">
>> > >       <bool name="ignoreOldUpdates">false</bool>
>> > >       <str name="versionField">doc_version</str>
>> > >     </processor>
>> > >
>> > > I am adding multiple documents to Solr in a single call using
>> > > SolrJ 5.2. The code fragment looks something like this:
>> > >
>> > >     try {
>> > >         UpdateResponse resp = solrClient.add(docs.getDocCollection(), 500);
>> > >         if (resp.getStatus() != 0) {
>> > >             throw new Exception("Failed to add docs in solr " + resp);
>> > >         }
>> > >     } catch (Exception e) {
>> > >         logError("Adding docs to solr failed", e);
>> > >     }
>> > >
>> > > If one of the documents violates the versioning constraints, Solr
>> > > returns an exception with an error message like "user version is
>> > > not high enough: 1454587156", and the other documents get added
>> > > fine. Is there a way I can know which document violated the
>> > > constraints, either from the Solr logs or from the UpdateResponse
>> > > returned by Solr?
>> > >
>> > > Thanks
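P.S. For the immediate question of *which* doc tripped the constraint,
the brute-force replay loop is just something like this (untested; "id"
and "doc_version" come from the config quoted above, failedBatch and
solrClient are assumed names):

    // Replay a failed batch one doc at a time so each rejection can be
    // tied to a concrete document.
    for (SolrInputDocument doc : failedBatch) {
        try {
            solrClient.add(doc);
        } catch (Exception e) {
            // DocBasedVersionConstraintsProcessorFactory reports stale
            // versions as "user version is not high enough: <version>".
            System.err.println("rejected id=" + doc.getFieldValue("id")
                    + " doc_version=" + doc.getFieldValue("doc_version")
                    + ": " + e.getMessage());
        }
    }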
