Hoss Man created SOLR-13176:
-------------------------------

             Summary: Testing of TLOG Replicas needs to be re-instated, may be 
hiding bugs
                 Key: SOLR-13176
                 URL: https://issues.apache.org/jira/browse/SOLR-13176
             Project: Solr
          Issue Type: Sub-task
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Hoss Man


As part of Mark Miller's push to clean up tests, one change he made as part of 
his _big_ SOLR-12801 commit (circa Nov 2018) was to disable the randomized use 
of TLOG replicas in a lot of tests.

His comments at the time were that he suspected a lot of the problems he was 
seeing were due to a poor implementation of 
{{TestInjection.waitForInSyncWithLeader()}} (which only comes into play for 
TLOG replicas), ultimately leading to him creating SOLR-12313.

But based on some limited experimentation I did with re-enabling TLOG 
replica randomization in some tests after (essentially) removing 
{{TestInjection.waitForInSyncWithLeader()}} in SOLR-13168, I'm still seeing a 
lot of sporadic test failures when TLOG replicas get used. The only change is 
that instead of "failing slow" because of the stalls introduced by 
{{TestInjection.waitForInSyncWithLeader()}}, they now fail quickly.

*It's not clear if these failures are because the tests have bugs; or if the 
tests don't account for the expected behavior of the TLOG replica types in 
certain situations; or if the code paths being tested have bugs when dealing 
with TLOG replicas.*

----

Bottom line: As things stand today, TLOG replicas aren't being very thoroughly 
tested, particularly in edge cases (HTTP partitions, LIR, leader election, 
mixed use of replica types, etc.).




