[
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974667#comment-13974667
]
Hoss Man commented on SOLR-445:
-------------------------------
bq. I think this would make it more confusing. Having this processor means that
the client wants to manage failing docs on their side. If all the docs fail so
be it.
Yeah, I'm not convinced you're wrong -- I just wasn't sure how I felt about it
and wanted to make sure we considered it. Even if users configure this, they
might be surprised if something like a schema.xml mismatch with some update
process they are using causes a 500 error on every individual update -- but
still results in a 200 coming back because of this component.
But I think you are right -- as long as the docs are clear that the status
will _always_ be a 200, even if all docs fail, we're fine.
bq. I was also thinking that this processor won’t work together with
DistributedUpdateProcessor, it has its own error processing, plus the
distribution would create multiple internal requests...
As long as this processor is configured before the
DistributedUpdateProcessorFactory it should work fine:
* when the requests get forwarded to other shards, they'll bypass this
processor (and any other processors that come before
DistributedUpdateProcessorFactory) so it won't break the cumulative error
handling in DistributedUpdateProcessorFactory
* DistributedUpdateProcessorFactory still ultimately throws only one Exception
per UpdateCommand when it forwards to multiple replicas, so your new processor
will still get at most 1 error to track per doc when accumulating results to
return to the client
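For reference, the ordering described above would look something like the
sketch below in solrconfig.xml -- the factory name TolerantUpdateProcessorFactory
and its maxErrors parameter are assumptions based on the attached patches, not
settled names:

```xml
<updateRequestProcessorChain name="tolerant-chain">
  <!-- must come before DistributedUpdateProcessorFactory so that
       forwarded sub-requests bypass it and per-doc errors can still
       be accumulated for the original client request -->
  <processor class="solr.TolerantUpdateProcessorFactory">
    <int name="maxErrors">10</int>
  </processor>
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```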
But it's trivial to write a distributed version of your test case to prove
that you get the results you expect -- probably a good idea to write one to
help future-proof this processor against unforeseen changes in the
distributed update processing.
> Update Handlers abort with bad documents
> ----------------------------------------
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Improvement
> Components: update
> Affects Versions: 1.3
> Reporter: Will Johnson
> Fix For: 4.9, 5.0
>
> Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445-alternative.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch,
> solr-445.xml
>
>
> Has anyone run into the problem of handling bad documents / failures
> mid-batch? I.e.:
> <add>
>   <doc>
>     <field name="id">1</field>
>   </doc>
>   <doc>
>     <field name="id">2</field>
>     <field name="myDateField">I_AM_A_BAD_DATE</field>
>   </doc>
>   <doc>
>     <field name="id">3</field>
>   </doc>
> </add>
> Right now solr adds the first doc and then aborts. It would seem like it
> should either fail the entire batch or log a message/return a code and then
> continue on to add doc 3. Option 1 would seem to be much harder to
> accomplish and possibly require more memory while Option 2 would require more
> information to come back from the API. I'm about to dig into this but I
> thought I'd ask to see if anyone had any suggestions, thoughts or comments.
>
--
This message was sent by Atlassian JIRA
(v6.2#6252)