[
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tomás Fernández Löbbe updated SOLR-445:
---------------------------------------
Attachment: SOLR-445-alternative.patch
My simple test to use with SolrCloud fails (not 100% of the time, but very
frequently). This is my understanding of the problem:
It only works when the update arrives at the shard leader (since the add then
fails locally), but if the update needs to be forwarded to the leader, it will
not work.
If the request is forwarded to the leader, the forwarding is done asynchronously
and the DistributedUpdateProcessor tracks the errors internally. Only after all
the docs have been processed is the “finish” method called, and at that point the
DistributedUpdateProcessor adds one of the exceptions to the response. This
is a problem because “processAdd” never actually fails the way the
TolerantUpdateProcessor expects. The TolerantUpdateProcessor also can’t know the
total number of errors, since that count is kept internally by the
DistributedUpdateProcessor.
As a side note, this DistributedUpdateProcessor behavior makes it “tolerant”,
but only in some cases. A request like this:
<add>invalid-doc</add>
<add>valid-doc</add>
<add>valid-doc</add>
would leave Solr in a different state depending on which node receives the
request (the shard leader or a replica/follower). Is this expected?
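To make the failure mode concrete, here is a minimal, self-contained sketch of the deferred-error pattern described above. These are not Solr's real classes; DeferringProcessor and TolerantDemo are hypothetical stand-ins that only illustrate why a try/catch around processAdd never fires when the downstream processor buffers errors until finish():

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the forwarding case: errors are buffered internally,
// not thrown from processAdd (roughly what the comment describes for
// DistributedUpdateProcessor; this is NOT the actual Solr class).
class DeferringProcessor {
    final List<RuntimeException> deferred = new ArrayList<>();

    void processAdd(String doc) {
        if (doc.startsWith("invalid")) {
            // The forwarded update fails remotely; the error is recorded
            // internally instead of being thrown to the caller.
            deferred.add(new RuntimeException("bad doc: " + doc));
        }
    }

    void finish() {
        // Only now does one of the buffered errors surface.
        if (!deferred.isEmpty()) throw deferred.get(0);
    }
}

public class TolerantDemo {
    // Counts how many errors a "tolerant" wrapper catches per document.
    // With deferred errors, processAdd never throws, so this is always 0.
    static int caughtPerDoc(String[] docs) {
        DeferringProcessor next = new DeferringProcessor();
        int caught = 0;
        for (String doc : docs) {
            try {
                next.processAdd(doc); // never throws in the forwarded case
            } catch (RuntimeException e) {
                caught++; // a tolerant processor would count/skip here
            }
        }
        return caught;
    }

    public static void main(String[] args) {
        String[] batch = {"invalid-doc", "valid-doc", "valid-doc"};
        System.out.println("errors caught in processAdd: " + caughtPerDoc(batch));
    }
}
```

The per-document catch block is dead code here, which is exactly the problem: the wrapper can neither skip the bad document nor count the failures.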
> Update Handlers abort with bad documents
> ----------------------------------------
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Improvement
> Components: update
> Affects Versions: 1.3
> Reporter: Will Johnson
> Fix For: 4.9, 5.0
>
> Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml
>
>
> Has anyone run into the problem of handling bad documents / failures
> mid-batch? I.e.:
> <add>
> <doc>
> <field name="id">1</field>
> </doc>
> <doc>
> <field name="id">2</field>
> <field name="myDateField">I_AM_A_BAD_DATE</field>
> </doc>
> <doc>
> <field name="id">3</field>
> </doc>
> </add>
> Right now Solr adds the first doc and then aborts. It seems like it
> should either fail the entire batch, or log a message/return a code and then
> continue on to add doc 3. Option 1 seems much harder to accomplish and would
> possibly require more memory, while Option 2 would require more information
> to come back from the API. I'm about to dig into this, but I thought I'd ask
> whether anyone had any suggestions, thoughts or comments.
>
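Option 2 from the quoted description (record the failure and continue with doc 3) can be sketched roughly as follows. This is a hypothetical batch loop with a made-up index() check, not Solr's actual update API:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchDemo {
    // Hypothetical per-document validation; stands in for the date-parsing
    // failure in the example batch.
    static void index(String doc) {
        if (doc.contains("I_AM_A_BAD_DATE")) {
            throw new IllegalArgumentException("unparseable date in: " + doc);
        }
    }

    // Option 2: keep going past bad documents, collecting errors so the
    // response can report which docs failed instead of aborting mid-batch.
    static List<String> addBatch(List<String> docs) {
        List<String> errors = new ArrayList<>();
        for (String doc : docs) {
            try {
                index(doc);
            } catch (IllegalArgumentException e) {
                errors.add(e.getMessage()); // log and continue
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<String> batch = List.of("id=1", "id=2 I_AM_A_BAD_DATE", "id=3");
        List<String> errors = addBatch(batch);
        System.out.println("indexed " + (batch.size() - errors.size())
                + " docs, " + errors.size() + " failed");
    }
}
```

For the three-doc example above this indexes docs 1 and 3 and reports one failure, which is the "return more information from the API" trade-off the description mentions.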
--
This message was sent by Atlassian JIRA
(v6.2#6252)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]