[ 
https://issues.apache.org/jira/browse/SOLR-11035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818171#comment-16818171
 ] 

Erick Erickson commented on SOLR-11035:
---------------------------------------

Now here we have a colossally ugly patch. We've known for a long time that 
there's a problem here, but we don't have a root cause. Meanwhile, here's a 
band-aid patch for tests that are sensitive to this issue, with a "fixer-upper" 
method cleverly named Solr11035BandAid. DocValuesNotIndexedTest is the one I 
was working on this weekend, and it's the test I've used the method in.

The root problem here is that I can:
> commit some docs synchronously, so when it returns I should have a new 
> searcher that sees them.
> go search for those docs added above and not find them. No matter how long I 
> wait.

So this patch creates a utility function in SolrTestCaseJ4 that we can call for 
"impossible" failures with this pattern that:
> checks whether the count for the query passed in matches the expected 
> count. If it does, returns. Otherwise
> indexes a bogus doc (with commit)
> deletes that same doc (with commit)
> checks numFound again and fails if they don't match.
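
The steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in the patch: DocStore, StaleStore, failNextReopen and bandAid are hypothetical names, and the fake store only simulates a commit that returns without the new searcher seeing the latest doc. The real method lives in SolrTestCaseJ4 and talks to a live Solr instance.

```java
import java.util.HashSet;
import java.util.Set;

final class BandAidSketch {

    /** Hypothetical stand-in for a Solr client; this is NOT the SolrJ API. */
    interface DocStore {
        void addAndCommit(String id);
        void deleteAndCommit(String id);
        long count(); // plays the role of numFound for a match-all query
    }

    /** Fake store that can simulate a commit whose new searcher never shows up. */
    static final class StaleStore implements DocStore {
        private final Set<String> index = new HashSet<>();
        private Set<String> visible = new HashSet<>();
        private boolean skipNextReopen;

        /** Assumption for the demo: the next commit will not refresh the searcher. */
        void failNextReopen() { skipNextReopen = true; }

        @Override public void addAndCommit(String id) { index.add(id); reopen(); }
        @Override public void deleteAndCommit(String id) { index.remove(id); reopen(); }
        @Override public long count() { return visible.size(); }

        private void reopen() {
            if (skipNextReopen) { skipNextReopen = false; return; }
            visible = new HashSet<>(index);
        }
    }

    /**
     * Returns false if the count already matched, true if the workaround was
     * needed and succeeded, and throws if the count is still wrong afterwards.
     */
    static boolean bandAid(DocStore store, long expected) {
        if (store.count() == expected) {
            return false; // nothing to fix
        }
        System.err.println("count mismatch, applying band-aid; expected " + expected
                + " but saw " + store.count());
        String bogusId = "bogus-doc-for-bandaid";
        store.addAndCommit(bogusId);    // index a bogus doc (with commit)
        store.deleteAndCommit(bogusId); // delete that same doc (with commit)
        long actual = store.count();
        if (actual != expected) {
            throw new AssertionError("expected " + expected + " docs but found " + actual);
        }
        return true;
    }

    public static void main(String[] args) {
        StaleStore store = new StaleStore();
        store.addAndCommit("1");
        store.addAndCommit("2");
        store.failNextReopen();
        store.addAndCommit("3"); // committed, but the searcher is stale
        System.out.println(store.count());                        // prints 2
        System.out.println(bandAid(store, 3) + " " + store.count()); // prints true 3
    }
}
```

Note that the caller has to pass in the expected count, and the extra add and delete each trigger a commit of their own.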

What I can guarantee:

> DocValuesNotIndexedTest would fail about 10-15% of the time with the test 
> case in the comments of the new method in SolrTestCaseJ4
> I see the log messages regularly from the new method, but the test calling it 
> succeeds
> This is not a good fix, but it'll reduce the noise until we figure out a 
> proper fix
> Once the underlying cause is fixed, we can comment out the body of this 
> method to see if the problem is really gone. If so, nuke it.

I'll commit this soon, and as other tests come up that have the same pattern we 
can add the call to the new method. Precommit passes, and all the 
DocValuesNotIndexedTest tests use it, but no others. I'm calling it 
SOLR-11035-bandaid.patch to keep it distinct from the _real_ fix.

Callers will have to take some care to know how many docs _should_ be found, 
which will be trickier when random numbers of docs are indexed. Any test that 
depends on predictable merging can't use it, since the extra add and delete 
each come with their own commit, etc.

> (at least) 2 distinct failures possible when clients attempt searches during 
> SolrCore reload
> --------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11035
>                 URL: https://issues.apache.org/jira/browse/SOLR-11035
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Hoss Man
>            Assignee: Erick Erickson
>            Priority: Major
>         Attachments: SOLR-11035-bandaid.patch, SOLR-11035.patch, log.txt, 
> log.txt
>
>
> If a SolrCore is reloaded, there are (at least) 2 distinct types of failures 
> that clients may observe when executing updates + queries while the reload is 
> in progress...
> * documents may appear missing during queries
> * queries may fail with "SolrException: openNewSearcher called on closed core"
> Details to follow in comment...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
