[
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206749#comment-15206749
]
Hoss Man commented on SOLR-445:
-------------------------------
bq. What is the impact of many docs failing due to missing ID? Is there a test
for that? I couldn't find one, but the diff is pretty big, I may have missed
stuff.
good question -- there were checks of this in TolerantUpdateProcessorTest (from
the early days of this patch) but i added some to
TestTolerantUpdateProcessorCloud which uncovered a bug (now fixed) when
checking isLeader -- see: cc2cd23ca2537324dc7e4afe6a29605bbf9f1cb8
bq. Don't know the answer to the "isLeader" question. I'd say the request would
fail if leader changes in the middle of a request, but I'm not sure.
Hmm... can you explain more what you think/expect could go wrong with the
isLeader code removed that wouldn't go wrong with the code as it is today? I
mean ... theoretically, even with the isLeader check as we have it right now,
the leader could change between the time we do the isLeader check and the call
to super.processAdd (where DUP will do its own isLeader check) ... or it could
change (again) between the time super.processAdd/DUP.processAdd throws an
exception and the time we make a decision whether to only track it or track and
immediately re-throw.
I'm just not sure if that added code is really gaining us anything useful --
but if someone can help me understand (or better still: demonstrate with a
test) a concrete situation where the current code does the correct thing, but
removing the isLeader check is broken then i'll be convinced.
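To make the timing concern concrete, here is a minimal, self-contained sketch of the check-then-act window described above. It does not use the Solr classes: `TolerantSketch`, `Downstream`, and the `AtomicBoolean` leadership flag are hypothetical stand-ins, and the rule "a leader only tracks, a non-leader tracks and re-throws" is an assumption made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

public class LeaderCheckSketch {

    /** Hypothetical downstream step, standing in for super.processAdd. */
    interface Downstream {
        void processAdd(String docId);
    }

    /** Hypothetical tolerant wrapper; NOT the real TolerantUpdateProcessor. */
    static final class TolerantSketch {
        private final BooleanSupplier isLeader;
        private final Downstream next;
        final List<String> tracked = new ArrayList<>();

        TolerantSketch(BooleanSupplier isLeader, Downstream next) {
            this.isLeader = isLeader;
            this.next = next;
        }

        void processAdd(String docId) {
            // Snapshot of leadership taken before the downstream call.
            boolean leaderAtCheck = isLeader.getAsBoolean();
            try {
                // The downstream step may do its own, later, leadership check.
                next.processAdd(docId);
            } catch (RuntimeException e) {
                tracked.add(docId);
                // Assumed rule: a non-leader tracks and immediately re-throws.
                // The decision relies on a snapshot that may already be stale.
                if (!leaderAtCheck) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        AtomicBoolean leader = new AtomicBoolean(true);
        TolerantSketch p = new TolerantSketch(leader::get, docId -> {
            leader.set(false); // leadership changes during the downstream call
            throw new IllegalArgumentException("bad doc " + docId);
        });
        p.processAdd("2"); // swallowed, because the snapshot said "leader"
        System.out.println("tracked=" + p.tracked + " leaderNow=" + leader.get());
    }
}
```

In this sketch the failure is only tracked even though the node is no longer leader by the time the exception arrives, so the up-front check narrows the window but does not close it.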
----
Where things currently stand:
* The only remaining nocommits on the branch are questions about deleting the
isLeader code, and questions about deleting DistribTolerantUpdateProcessorTest
since we have other more robust cloud tests now.
* Even with the "retry after giving searchers time to reopen" logic in
TestTolerantUpdateProcessorRandomCloud, i'm seeing a failure that reproduces
consistently for me...{noformat}
[junit4] 2> NOTE: reproduce with: ant test
-Dtestcase=TestTolerantUpdateProcessorRandomCloud
-Dtests.method=testRandomUpdates -Dtests.seed=ECFD2B9118A542E7
-Dtests.slow=true -Dtests.locale=bg -Dtests.timezone=Asia/Taipei
-Dtests.asserts=true -Dtests.file.encoding=UTF-8
[junit4] FAILURE 6.00s |
TestTolerantUpdateProcessorRandomCloud.testRandomUpdates <<<
[junit4] > Throwable #1: java.lang.AssertionError: cloud client doc count
doesn't match bitself cardinality expected:<22> but was:<23>
{noformat}...so i'm currently working to improve the logging and trace through
the test to understand that.
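For readers unfamiliar with this style of check, the assertion above compares the number of documents found in the collection against the cardinality of a bitset of expected ids. A rough sketch of that bookkeeping follows, assuming a plain `HashSet` as a stand-in for the index; the class name, the `run` helper, and the 10% failure rate are all hypothetical, and the real test of course runs against a SolrCloud cluster.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class BitSetBookkeepingSketch {

    /**
     * Applies random adds and deletes to a stand-in index while mirroring the
     * expected state in a BitSet. Returns {expected cardinality, actual count}.
     */
    static int[] run(long seed, int ops, int maxId) {
        Random random = new Random(seed);
        Set<Integer> index = new HashSet<>(); // hypothetical stand-in for the collection
        BitSet expected = new BitSet(maxId);
        for (int i = 0; i < ops; i++) {
            int id = random.nextInt(maxId);
            boolean rejected = random.nextInt(10) == 0; // simulated bad update
            if (rejected) {
                continue; // a rejected update must change neither side
            }
            if (random.nextBoolean()) {
                index.add(id);
                expected.set(id);
            } else {
                index.remove(id);
                expected.clear(id);
            }
        }
        return new int[] { expected.cardinality(), index.size() };
    }

    public static void main(String[] args) {
        int[] r = run(42L, 200, 50);
        if (r[0] != r[1]) {
            throw new AssertionError("doc count doesn't match bitset cardinality expected:<"
                + r[0] + "> but was:<" + r[1] + ">");
        }
        System.out.println("expected=" + r[0] + " actual=" + r[1]);
    }
}
```

An off-by-one like 22 vs 23 means one update was recorded on only one side, for example a document that was indexed even though the test counted it as failed.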
> Update Handlers abort with bad documents
> ----------------------------------------
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Improvement
> Components: update
> Affects Versions: 1.3
> Reporter: Will Johnson
> Assignee: Hoss Man
> Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445-alternative.patch,
> SOLR-445-alternative.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch,
> SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml
>
>
> Has anyone run into the problem of handling bad documents / failures mid
> batch. Ie:
> <add>
>   <doc>
>     <field name="id">1</field>
>   </doc>
>   <doc>
>     <field name="id">2</field>
>     <field name="myDateField">I_AM_A_BAD_DATE</field>
>   </doc>
>   <doc>
>     <field name="id">3</field>
>   </doc>
> </add>
> Right now solr adds the first doc and then aborts. It would seem like it
> should either fail the entire batch or log a message/return a code and then
> continue on to add doc 3. Option 1 would seem to be much harder to
> accomplish and possibly require more memory while Option 2 would require more
> information to come back from the API. I'm about to dig into this but I
> thought I'd ask to see if anyone had any suggestions, thoughts or comments.
>
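As an illustration of "Option 2" from the issue description (report the failure and continue with the remaining docs), here is a minimal sketch. It is not Solr code: `addDoc`, `addAll`, and `Result` are hypothetical names, documents are modelled as plain maps, and `LocalDate.parse` is only a stand-in for whatever validation a real date field would apply.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TolerantBatchSketch {

    /** Outcome of a batch: ids that were added plus one message per rejected id. */
    static final class Result {
        final List<String> added = new ArrayList<>();
        final Map<String, String> errors = new LinkedHashMap<>();
    }

    /** Hypothetical per-document add; rejects a myDateField that is not an ISO date. */
    static void addDoc(Map<String, String> doc) {
        String date = doc.get("myDateField");
        if (date != null) {
            LocalDate.parse(date); // throws DateTimeParseException on bad input
        }
    }

    /** Adds every document it can and records failures instead of aborting. */
    static Result addAll(List<Map<String, String>> batch) {
        Result result = new Result();
        for (Map<String, String> doc : batch) {
            String id = doc.get("id");
            try {
                addDoc(doc);
                result.added.add(id);
            } catch (DateTimeParseException e) {
                result.errors.put(id, e.getMessage());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> batch = List.of(
            Map.of("id", "1"),
            Map.of("id", "2", "myDateField", "I_AM_A_BAD_DATE"),
            Map.of("id", "3"));
        Result r = addAll(batch);
        System.out.println("added=" + r.added + " failed=" + r.errors.keySet());
    }
}
```

With the three-document batch from the description, docs 1 and 3 are added and doc 2 is reported as failed, which is the extra information the API would need to return under Option 2.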