[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Erick Erickson updated SOLR-445:
--------------------------------
Attachment: SOLR-445.patch
So, Grant. How do you feel about refactorings <G>?
I got bitten by this problem again so I decided to dust off the patch, and I
re-created it. This one shouldn't have the gratuitous re-formatting. But, after
I added the bookkeeping, the method got even more unwieldy, so I extracted some
of the code to methods in XMLLoader. I also have the un-refactored version if
this one is too painful.
This patch incorporates the changes you suggested months ago. I'm a little
uncertain whether UpdateParams.java was the correct place for the new
constant, but it seemed like the pattern used for other parameters.
One minor issue: if you don't start the packet with <add>, the behavior is the
same as it used to be: an NPE is thrown. That's because the addCmd variable
isn't initialized until the <add> tag is encountered, and the NPE results from
using addCmd later (I think I was seeing it at line 118). I think it would be
better to fail explicitly when the first element isn't an <add> element,
rather than because it just happens to cause an NPE.
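
To make the idea concrete, here is a rough sketch of an explicit check,
written against plain StAX (javax.xml.stream) rather than the real XMLLoader
code. The class and method names (FirstElementCheck, firstElementName,
requireAdd) are made up for illustration and are not in the patch. It also
treats <add> as the only valid first element, as described above; a real
check would have to allow the other update commands as well.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration only; not part of the patch or of Solr.
class FirstElementCheck {

    /** Returns the local name of the first element, skipping any events that precede it. */
    static String firstElementName(String xml) {
        try {
            XMLStreamReader parser =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (parser.hasNext()) {
                // Comments, processing instructions and whitespace before the
                // root element are stepped over here.
                if (parser.next() == XMLStreamConstants.START_ELEMENT) {
                    return parser.getLocalName();
                }
            }
            throw new IllegalArgumentException("Update packet contains no elements");
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed update packet", e);
        }
    }

    /** Fails with a descriptive message, instead of a later NPE, when the packet does not start with <add>. */
    static void requireAdd(String xml) {
        String name = firstElementName(xml);
        if (!"add".equals(name)) {
            throw new IllegalArgumentException(
                "Expected <add> as the first element but found <" + name + ">");
        }
    }

    public static void main(String[] args) {
        // A packet with an XML declaration in front of <add> is accepted.
        requireAdd("<?xml version=\"1.0\"?><add><doc/></add>");
        try {
            // A packet that starts with <doc> is rejected with a clear message.
            requireAdd("<doc><field name=\"id\">1</field></doc>");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the loop only reacts to START_ELEMENT, anything the parser reports
before the root element is skipped without special handling.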
While I'm at it, though, what do you think about making this robust enough to
ignore ?xml and/or !DOCTYPE entries? Or is that just not worth the bother?
Erick
> XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
> --------------------------------------------------------------------
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Bug
> Components: update
> Affects Versions: 1.3
> Reporter: Will Johnson
> Assignee: Grant Ingersoll
> Fix For: Next
>
> Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml
>
>
> Has anyone run into the problem of handling bad documents / failures mid
> batch? I.e.:
> <add>
>   <doc>
>     <field name="id">1</field>
>   </doc>
>   <doc>
>     <field name="id">2</field>
>     <field name="myDateField">I_AM_A_BAD_DATE</field>
>   </doc>
>   <doc>
>     <field name="id">3</field>
>   </doc>
> </add>
> Right now Solr adds the first doc and then aborts. It would seem like it
> should either fail the entire batch or log a message/return a code and then
> continue on to add doc 3. Option 1 would seem to be much harder to
> accomplish and possibly require more memory while Option 2 would require more
> information to come back from the API. I'm about to dig into this but I
> thought I'd ask to see if anyone had any suggestions, thoughts or comments.
>
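
For reference, Option 2 from the description above (record the failure and
carry on with the rest of the batch) could look roughly like the sketch
below. This is not the code in the patch: BatchAddSketch, Result, addDoc and
addAll are invented names, documents are modeled as plain maps, and the date
check merely stands in for whatever validation the real indexing path does.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of "continue on error" bookkeeping; not Solr code.
class BatchAddSketch {

    /** Outcome of a batch: ids that were added, plus id -> error message for those that were not. */
    static final class Result {
        final List<String> added = new ArrayList<>();
        final Map<String, String> failed = new LinkedHashMap<>();
    }

    /** Builds a document from alternating field names and values. */
    static Map<String, String> doc(String... namesAndValues) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            fields.put(namesAndValues[i], namesAndValues[i + 1]);
        }
        return fields;
    }

    /** Stand-in for indexing one document; rejects a myDateField value that is not an ISO date. */
    static void addDoc(Map<String, String> doc) {
        String date = doc.get("myDateField");
        if (date != null) {
            LocalDate.parse(date); // throws DateTimeParseException on bad input
        }
    }

    /** Adds every document, recording failures instead of aborting at the first bad one. */
    static Result addAll(List<Map<String, String>> docs) {
        Result result = new Result();
        for (Map<String, String> doc : docs) {
            String id = doc.get("id");
            try {
                addDoc(doc);
                result.added.add(id);
            } catch (RuntimeException e) {
                result.failed.put(id, e.getMessage());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result result = addAll(Arrays.asList(
            doc("id", "1"),
            doc("id", "2", "myDateField", "I_AM_A_BAD_DATE"),
            doc("id", "3")));
        System.out.println("added:  " + result.added);
        System.out.println("failed: " + result.failed.keySet());
    }
}
```

The caller gets back both lists, which is the extra information Option 2
needs to return from the API.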