[ 
https://issues.apache.org/jira/browse/SOLR-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751838#comment-16751838
 ] 

Hoss Man commented on SOLR-12313:
---------------------------------

Beyond the concerns mark previously raised regarding: documentation of purpose; 
how slow this can make tests; & wether it's correctly implemented – I've 
recently noticed two additional big problems:
 * this method causes *guaranteed* failures in 
{{ForceLeaderTest.testReplicasInLIRNoLeader}} on branch_7x when the randomizer 
enables java {{asserts}} _and_ {{onlyLeaderIndexes}} randomizes to "true" 
(resulting in the use of tlog replicas)
 ** Note: forcing {{TestInjection.waitForReplicasInSync = null}} allows this 
test to sometimes pass, but it's still finicky due to the delay in search 
results being updated in tlog replicas
 ** Mark had previously (Nov2018) modified this test class to prevent tlog 
replicas, but aparently didn't notice on backporting that 7x had this 
additional test method (deprecated & removed in master) that used the 
{{onlyLeaderIndexes}} variable directly
 * {{TestInjection.waitForInSyncWithLeader()}} does not obey the contract all 
test injection methods must obey
 ** notably it (by default) forces commits to tlog replicas to block until in 
sync with leader in non-test Solr instances if solr was started with JVM 
assertions
 ** this is an end user facing performance bottleneck, that can cause real 
world commits to time out!

Unless there are objections from someone who understands tlog replicas really 
well & is willing to step up and "fix" {{waitForInSyncWithLeader()}} (including 
mark's concerns above about how it's designed), I plan to do the following:
 # create a sub-task to document & track the user affecting bug regarding 
TLOG+assertions / commits in CHANGES.txt
 # commit a change making {{TestInjection.waitForInSyncWithLeader()}} a No-Op, 
citing both this issue and the new sub-task

 ** backporting down to 7x & resolving the sub-task as fixed in 7.7
 # commit a change to ForceLeaderTest's initialization of {{onlyLeaderIndexes}} 
completely preventing the use of tlog replicas as mark intended in his Nov2018 
commits
 ** backporting down to 7x

> TestInjection#waitForInSyncWithLeader needs improvement.
> --------------------------------------------------------
>
>                 Key: SOLR-12313
>                 URL: https://issues.apache.org/jira/browse/SOLR-12313
>             Project: Solr
>          Issue Type: Test
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Mark Miller
>            Priority: Major
>
> This really should have some doc for why it would be used.
> I also think it causes BasicDistributedZkTest to take forever for sometimes 
> and perhaps other tests?
> I think checking for uncommitted data is probably a race condition and should 
> be removed.
> Checking index versions should follow the rules that replication does - if 
> the slave is higher than the leader, it's in sync, being equal is not 
> required. If it's expected for a test it should be a specific test that 
> fails. This just introduces massive delays.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to