[
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327357#comment-15327357
]
ASF GitHub Bot commented on SOLR-445:
-------------------------------------
GitHub user arafalov opened a pull request:
https://github.com/apache/lucene-solr/pull/43
Trivial name spelling fix for SOLR-445
ToleranteUpdateProcessorFactory -> TolerantUpdateProcessorFactory
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/arafalov/lucene-solr-1 patch-3
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/lucene-solr/pull/43.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #43
----
commit 6742355f93f0d2d03600fe408b542507ee89bf54
Author: Alexandre Rafalovitch <[email protected]>
Date: 2016-06-13T13:19:25Z
Trivial Spelling fix
ToleranteUpdateProcessorFactory -> TolerantUpdateProcessorFactory
commit ebffa9aa2aebd689db53ba363d5022b893c7eeb0
Author: Alexandre Rafalovitch <[email protected]>
Date: 2016-06-13T13:22:49Z
Trivial Spelling fix
ToleranteUpdateProcessorFactory -> TolerantUpdateProcessorFactory
----
> Update Handlers abort with bad documents
> ----------------------------------------
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Improvement
> Components: update
> Reporter: Will Johnson
> Assignee: Hoss Man
> Fix For: 6.1, master (7.0)
>
> Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml
>
>
> This issue adds a new {{TolerantUpdateProcessorFactory}}, making it possible
> to configure Solr updates so that they are "tolerant" of individual errors in
> an update request...
> {code}
> <processor class="solr.TolerantUpdateProcessorFactory">
>   <int name="maxErrors">10</int>
> </processor>
> {code}
> When a chain with this processor is used, but maxErrors isn't exceeded,
> here's what the response looks like...
> {code}
> $ curl 'http://localhost:8983/solr/techproducts/update?update.chain=tolerant-chain&wt=json&indent=true&maxErrors=-1' \
>     -H "Content-Type: application/json" \
>     --data-binary '{"add" : {"doc":{"id":"1","foo_i":"bogus"}}, "delete": {"query":"malformed:["}}'
> {
>   "responseHeader":{
>     "errors":[{
>         "type":"ADD",
>         "id":"1",
>         "message":"ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\""},
>       {
>         "type":"DELQ",
>         "id":"malformed:[",
>         "message":"org.apache.solr.search.SyntaxError: Cannot parse 'malformed:[': Encountered \"<EOF>\" at line 1, column 11.\nWas expecting one of:\n <RANGE_QUOTED> ...\n <RANGE_GOOP> ...\n "}],
>     "maxErrors":-1,
>     "status":0,
>     "QTime":1}}
> {code}
> Note in the above example that:
> * maxErrors can be overridden on a per-request basis
> * an effective {{maxErrors==-1}} (either from config, or request param) means
> "unlimited" (under the covers it's using {{Integer.MAX_VALUE}})
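As a rough illustration of the two notes above, here is a minimal Python sketch of how an effective maxErrors value could be resolved. The function name and arguments are hypothetical and are not taken from the Solr source; only the "-1 means Integer.MAX_VALUE" rule and the per-request override come from the description.

```python
# Hypothetical helper: names are illustrative, not Solr internals.
INT_MAX = 2**31 - 1  # numeric value of Java's Integer.MAX_VALUE


def effective_max_errors(configured, request_param=None):
    """Return the error budget that applies to one update request.

    configured    -- maxErrors from the processor configuration
    request_param -- optional per-request maxErrors override (string or int)
    """
    # A request parameter, when present, overrides the configured value.
    value = configured if request_param is None else int(request_param)
    # An effective value of -1 means "unlimited".
    return INT_MAX if value == -1 else value


print(effective_max_errors(10))        # configuration only
print(effective_max_errors(10, "-1"))  # request override: unlimited
print(effective_max_errors(10, "1"))   # request override: 1
```

Only the exact value -1 is treated as unlimited here, since that is the only case the description mentions.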
> If/When maxErrors is reached for a request, then the _first_ exception that
> the processor caught is propagated back to the user, and metadata is set on
> that exception with all of the same details about all the tolerated errors.
> This next example is the same as the previous except that instead of
> {{maxErrors=-1}} the request param is now {{maxErrors=1}}...
> {code}
> $ curl 'http://localhost:8983/solr/techproducts/update?update.chain=tolerant-chain&wt=json&indent=true&maxErrors=1' \
>     -H "Content-Type: application/json" \
>     --data-binary '{"add" : {"doc":{"id":"1","foo_i":"bogus"}}, "delete": {"query":"malformed:["}}'
> {
>   "responseHeader":{
>     "errors":[{
>         "type":"ADD",
>         "id":"1",
>         "message":"ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\""},
>       {
>         "type":"DELQ",
>         "id":"malformed:[",
>         "message":"org.apache.solr.search.SyntaxError: Cannot parse 'malformed:[': Encountered \"<EOF>\" at line 1, column 11.\nWas expecting one of:\n <RANGE_QUOTED> ...\n <RANGE_GOOP> ...\n "}],
>     "maxErrors":1,
>     "status":400,
>     "QTime":1},
>   "error":{
>     "metadata":[
>       "org.apache.solr.common.ToleratedUpdateError--ADD:1","ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\"",
>       "org.apache.solr.common.ToleratedUpdateError--DELQ:malformed:[","org.apache.solr.search.SyntaxError: Cannot parse 'malformed:[': Encountered \"<EOF>\" at line 1, column 11.\nWas expecting one of:\n <RANGE_QUOTED> ...\n <RANGE_GOOP> ...\n ",
>       "error-class","org.apache.solr.common.SolrException",
>       "root-error-class","java.lang.NumberFormatException"],
>     "msg":"ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\"",
>     "code":400}}
> {code}
> ...the added exception metadata ensures that even in client code like the
> various SolrJ SolrClient implementations, which throw a (client side)
> exception on non-200 responses, the end user can access info on all the
> tolerated errors that were ignored before the maxErrors threshold was reached.
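For anyone consuming the raw JSON rather than going through SolrJ, here is a small Python sketch of how the flat "metadata" list shown above could be turned back into per-operation errors. It assumes only the key format visible in the example response (the ToleratedUpdateError prefix, then TYPE, a colon, and the id); the function name is hypothetical and the DELQ message is abbreviated.

```python
PREFIX = "org.apache.solr.common.ToleratedUpdateError--"


def tolerated_errors(metadata):
    """Extract tolerated errors from a flat [key, value, key, value, ...] list."""
    errors = []
    # Walk the list two items at a time: even index = key, odd index = value.
    for key, message in zip(metadata[0::2], metadata[1::2]):
        if not key.startswith(PREFIX):
            continue  # skips entries such as "error-class"
        # Split on the first ':' only, because the id itself may contain ':'.
        op_type, _, op_id = key[len(PREFIX):].partition(":")
        errors.append({"type": op_type, "id": op_id, "message": message})
    return errors


# Sample data copied from the response above (DELQ message abbreviated).
metadata = [
    PREFIX + "ADD:1",
    "ERROR: [doc=1] Error adding field 'foo_i'='bogus' msg=For input string: \"bogus\"",
    PREFIX + "DELQ:malformed:[",
    "org.apache.solr.search.SyntaxError: Cannot parse 'malformed:['",
    "error-class", "org.apache.solr.common.SolrException",
    "root-error-class", "java.lang.NumberFormatException",
]

for err in tolerated_errors(metadata):
    print(err["type"], err["id"])
```

Splitting on the first colon keeps an id such as `malformed:[` intact.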
> ----
> {panel:title=Original Jira Request}
> Has anyone run into the problem of handling bad documents or failures
> mid-batch? For example:
> <add>
>   <doc>
>     <field name="id">1</field>
>   </doc>
>   <doc>
>     <field name="id">2</field>
>     <field name="myDateField">I_AM_A_BAD_DATE</field>
>   </doc>
>   <doc>
>     <field name="id">3</field>
>   </doc>
> </add>
> Right now Solr adds the first doc and then aborts. It would seem like it
> should either fail the entire batch or log a message/return a code and then
> continue on to add doc 3. Option 1 would seem to be much harder to
> accomplish and possibly require more memory while Option 2 would require more
> information to come back from the API. I'm about to dig into this but I
> thought I'd ask to see if anyone had any suggestions, thoughts or comments.
>
> {panel}
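To make "Option 2" from the original request concrete, here is a minimal Python sketch of a batch loop that records a failure and carries on, and that gives up only once the number of errors exceeds a budget. It is an illustration under stated assumptions, not the Solr implementation: `add_doc`, `tolerant_add`, the date format, and the `tolerated` attribute are all hypothetical.

```python
from datetime import datetime

INT_MAX = 2**31 - 1  # stands in for "unlimited" when max_errors is -1


def add_doc(index, doc):
    """Hypothetical stand-in for indexing: rejects unparseable dates."""
    if "myDateField" in doc:
        # Raises ValueError when the value is not a valid timestamp.
        datetime.strptime(doc["myDateField"], "%Y-%m-%dT%H:%M:%SZ")
    index.append(doc["id"])


def tolerant_add(index, docs, max_errors):
    """Add every doc it can; return the list of tolerated errors."""
    limit = INT_MAX if max_errors == -1 else max_errors
    errors = []
    first_exc = None
    for doc in docs:
        try:
            add_doc(index, doc)
        except ValueError as exc:
            if first_exc is None:
                first_exc = exc
            errors.append({"type": "ADD", "id": doc["id"], "message": str(exc)})
            if len(errors) > limit:
                # Budget exceeded: re-raise the first failure and attach
                # the details of every error seen so far.
                first_exc.tolerated = errors
                raise first_exc
    return errors


docs = [
    {"id": "1"},
    {"id": "2", "myDateField": "I_AM_A_BAD_DATE"},
    {"id": "3"},
]
index = []
errors = tolerant_add(index, docs, max_errors=10)
print(index)                      # ids that were added
print([e["id"] for e in errors])  # ids that were skipped
```

With a budget of 10, docs 1 and 3 are added and doc 2 is reported; with `max_errors=0` the same batch would raise on doc 2, which is closest to the abort behaviour described in the request.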
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)