[
https://issues.apache.org/jira/browse/SOLR-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754596#comment-16754596
]
Tomás Fernández Löbbe commented on SOLR-13176:
----------------------------------------------
Much of the TLOG testing was added for the "onlyLeaderIndexes" changes, so
[~caomanhdat] can probably comment on it more. I'm not sure I follow exactly,
but if the logic in {{waitForInSyncWithLeader}} is commented out so that it
just returns immediately, I expect lots of tests to fail. A sequence like:
* add document
* commit
* search
won't work. Those other tests were never modified to handle TLOG replicas;
they assume the same behavior as NRT replicas (except for {{TestTlogReplica}}
and maybe {{ChaosMonkeySafeLeaderWithPullReplicasTest}}).
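
To illustrate why an add / commit / search sequence can fail on a TLOG replica when nothing waits for it to catch up, here is a toy model in plain Java. It is only a sketch and does not use any Solr API: the class and method names (TlogReplicaSketch, Leader, TlogReplica, replicateFrom) are hypothetical, and it assumes the simplified behavior that a TLOG replica only records updates in its transaction log and becomes searchable after copying the leader's committed index.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TlogReplicaSketch {

    /** Toy leader (hypothetical): a commit makes buffered documents searchable. */
    static class Leader {
        private final List<String> pending = new ArrayList<>();
        private final Set<String> searchable = new HashSet<>();

        void add(String id) { pending.add(id); }

        void commit() {
            searchable.addAll(pending);
            pending.clear();
        }

        boolean search(String id) { return searchable.contains(id); }

        /** Copy of the committed index, as a replica would fetch it. */
        Set<String> committedSnapshot() { return new HashSet<>(searchable); }
    }

    /**
     * Toy TLOG replica (hypothetical, simplified assumption): updates are only
     * appended to a log, and a local commit does not make them searchable.
     */
    static class TlogReplica {
        private final List<String> tlog = new ArrayList<>();
        private Set<String> searchable = new HashSet<>();

        void add(String id) { tlog.add(id); }

        void commit() {
            // Intentionally a no-op in this model: no local indexing happens.
        }

        /** Stand-in for the periodic index replication from the leader. */
        void replicateFrom(Leader leader) { searchable = leader.committedSnapshot(); }

        boolean search(String id) { return searchable.contains(id); }
    }

    public static void main(String[] args) {
        Leader leader = new Leader();
        TlogReplica replica = new TlogReplica();

        // add document, commit
        leader.add("doc1");
        replica.add("doc1");
        leader.commit();
        replica.commit();

        // search: the leader sees the document, the replica does not yet
        System.out.println("leader sees doc1: " + leader.search("doc1"));
        System.out.println("replica before replication: " + replica.search("doc1"));

        // only after replication does the replica's search succeed
        replica.replicateFrom(leader);
        System.out.println("replica after replication: " + replica.search("doc1"));
    }
}
```

In this model, a test written for NRT semantics that searches the replica right after the commit would fail unless something makes it wait until replication has happened.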
> Testing of TLOG Replicas needs to be re-instated, may be hiding bugs
> --------------------------------------------------------------------
>
> Key: SOLR-13176
> URL: https://issues.apache.org/jira/browse/SOLR-13176
> Project: Solr
> Issue Type: Sub-task
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Hoss Man
> Priority: Major
>
> As part of Mark Miller's push to clean up tests, one change he made as part of
> his _big_ SOLR-12801 commit (circa Nov 2018) was to disable the randomized
> use of TLOG replicas in a lot of tests.
> His comments at the time were that he suspected a lot of the problems he was
> seeing were due to a poor implementation of
> {{TestInjection.waitForInSyncWithLeader()}} (which only comes into play for
> TLOG replicas), ultimately leading to him creating SOLR-12313.
> But based on some limited experimentation I did with re-enabling TLOG
> replica randomization in some tests after (essentially) removing
> {{TestInjection.waitForInSyncWithLeader()}} in SOLR-13168, I'm still seeing a
> lot of sporadic test failures when TLOG replicas get used. The only change
> is that instead of "failing slow" because of the stalls introduced by
> {{TestInjection.waitForInSyncWithLeader()}}, they now fail quickly.
> *It's not clear if these failures are because the tests have bugs; or if the
> tests don't account for the expected behavior of the TLOG replica types in
> certain situations; or if the code paths being tested have bugs when dealing
> with TLOG replicas.*
> ----
> Bottom line: As things stand today, TLOG replicas aren't being very
> thoroughly tested, particularly in edge cases (HTTP partitions, LIR, leader
> election, mixed use of replica types, etc.).