[
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269854#comment-14269854
]
Hoss Man commented on SOLR-445:
-------------------------------
bq. It works only in the case of the update arriving to the shard leader (as it
would fail while adding the doc locally), but if the update needs to be
forwarded to the leader, then it will not work.
...I'm not sure if this will solve all of the problems Tomas ran into, but one
thing that might help (and was added after the latest version of the patch was
written) is the "UpdateRequestProcessorFactory.RunAlways" marker interface. It
gives UpdateProcessorFactories a mechanism to say they want to be run as part
of the chain even if the "update.distrib" logic would normally skip them for
having already been run on a previous node (i.e., the update has already been
forwarded once).
So that interface, combined with some basic "am I the leader?" checks, could
allow this processor to ensure that certain bits of logic always (and only)
execute on the leader.
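
To make the idea concrete, here is a minimal, self-contained sketch of how a
marker interface can exempt a factory from being skipped on an already
forwarded update. None of the types below are the real Solr classes: the
ProcessorFactory, RunAlways and Distributing interfaces, the factory classes,
and the ChainSketch helper are hypothetical stand-ins, and the skip rule shown
is an assumption about the general shape of the logic, not a copy of Solr's
chain code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a processor factory; NOT the real Solr type. */
interface ProcessorFactory {
    String name();
}

/** Hypothetical marker, modeled on the idea behind UpdateRequestProcessorFactory.RunAlways. */
interface RunAlways {}

/** Hypothetical marker for the factory that forwards updates to other nodes. */
interface Distributing {}

final class PlainFactory implements ProcessorFactory {
    private final String name;
    PlainFactory(String name) { this.name = name; }
    public String name() { return name; }
}

/** A tolerant-update factory that asks to run even on forwarded updates. */
final class TolerantFactory implements ProcessorFactory, RunAlways {
    public String name() { return "tolerant"; }

    /** Assumed "am I the leader?" gate: only the leader handles per-doc errors. */
    static boolean shouldHandleErrors(boolean isLeader) { return isLeader; }
}

final class DistribFactory implements ProcessorFactory, Distributing {
    public String name() { return "distrib"; }
}

final class ChainSketch {
    /**
     * Returns the names of the factories that would run. When the update has
     * already been forwarded, factories placed before the distributing factory
     * are skipped unless they carry the RunAlways marker.
     */
    static List<String> factoriesToRun(List<ProcessorFactory> chain, boolean alreadyForwarded) {
        List<String> out = new ArrayList<>();
        boolean beforeDistrib = true;
        for (ProcessorFactory f : chain) {
            if (f instanceof Distributing) {
                beforeDistrib = false;
            }
            boolean skip = alreadyForwarded && beforeDistrib && !(f instanceof RunAlways);
            if (!skip) {
                out.add(f.name());
            }
        }
        return out;
    }
}
```

With a chain of a plain factory, the tolerant factory, the distributing
factory and another plain factory, a forwarded update would skip only the
first plain factory. The tolerant processor would then consult something like
shouldHandleErrors(isLeader) before doing its leader-only work.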
(There might still be some problems, however, in terms of accurately
responding/reporting aggregate failures when batch updates involve docs that go
to different leaders.)
> Update Handlers abort with bad documents
> ----------------------------------------
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Improvement
> Components: update
> Affects Versions: 1.3
> Reporter: Will Johnson
> Fix For: 4.9, Trunk
>
> Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml
>
>
> Has anyone run into the problem of handling bad documents / failures
> mid-batch? For example:
> <add>
> <doc>
> <field name="id">1</field>
> </doc>
> <doc>
> <field name="id">2</field>
> <field name="myDateField">I_AM_A_BAD_DATE</field>
> </doc>
> <doc>
> <field name="id">3</field>
> </doc>
> </add>
> Right now Solr adds the first doc and then aborts. It would seem like it
> should either fail the entire batch or log a message/return a code and then
> continue on to add doc 3. Option 1 would seem to be much harder to
> accomplish and possibly require more memory while Option 2 would require more
> information to come back from the API. I'm about to dig into this but I
> thought I'd ask to see if anyone had any suggestions, thoughts or comments.
>
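
As a rough illustration of "Option 2" from the description above (record the
failure and continue), here is a minimal, self-contained sketch. It is not
Solr code: the TolerantBatchSketch class, its Result type, and the validate
rule (treating myDateField as an ISO-8601 instant) are all assumptions made
for the example.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a batch add that tolerates bad documents. */
final class TolerantBatchSketch {

    /** Collects the ids that were added and the ids that failed, with a reason. */
    static final class Result {
        final List<String> added = new ArrayList<>();
        final Map<String, String> failed = new LinkedHashMap<>();
    }

    /** Assumed validation rule: myDateField, if present, must parse as an ISO-8601 instant. */
    static void validate(Map<String, String> doc) {
        String date = doc.get("myDateField");
        if (date != null) {
            Instant.parse(date); // throws DateTimeParseException on bad input
        }
    }

    /** Adds every valid doc and records failures instead of aborting the batch. */
    static Result addAll(List<Map<String, String>> docs) {
        Result result = new Result();
        for (Map<String, String> doc : docs) {
            String id = doc.get("id");
            try {
                validate(doc);
                result.added.add(id);
            } catch (DateTimeParseException e) {
                result.failed.put(id, e.getMessage());
            }
        }
        return result;
    }
}
```

For the three-doc batch in the description, this sketch would add docs 1 and 3
and report doc 2 as failed, which is the extra information the API would need
to return to the client.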