Sorry, I posted a wrong unrelated PR link in my previous email. The PR for this change is https://github.com/apache/solr/pull/2063 "SOLR-17067 <https://issues.apache.org/jira/browse/SOLR-17067> SolrCloudTestCase activeClusterShape() should only consider active shards"
On Sat, Nov 4, 2023 at 11:28 PM Ilan Ginzburg <ilans...@gmail.com> wrote: > Created SOLR-17067 <https://issues.apache.org/jira/browse/SOLR-17067> and PR > 1965 <https://github.com/apache/solr/pull/1965> > > On Sat, Nov 4, 2023 at 8:28 PM Ilan Ginzburg <ilans...@gmail.com> wrote: > >> Thanks Hoss for your input. I was focused on that specific method (to >> contribute back a test change we made on our Solr fork) and was missing the >> big picture. I therefore explored your suggestions. >> >> I agree activeClusterShape() is where the change makes the most sense. I >> can of course limit it to that method. >> >> Note though that both clusterShape() and activeClusterShape() already >> filter out replicas advertised as active but living on non live nodes! >> A replica is effectively active if its state says so *and* its node is >> live *and* its shard is active. >> >> When compareActiveReplicaCountsForShards() is changed to not count >> active replicas on non active shards (impacting the two verification >> methods), the only org.apache.solr.cloud tests that fail are split >> related and already call activeClusterShape(). Splits end with the >> parent shard being inactive but its replicas still active. >> These tests correctly verify the post split number of shards (increasing >> total count by the number of children, decreasing by one to account for the >> parent shard), but verify a total number of replicas including the new >> replicas for the children shards without removing the ones for the parent. >> >> ClusterStateUtil.waitForAllActiveAndLiveReplicas() and >> waitForAllReplicasNotLive() verify replicas for a specific state and do >> not look at replicas of inactive shards. >> >> Ilan >> >> >> >> On Fri, Nov 3, 2023 at 7:12 PM Chris Hostetter <hossman_luc...@fucit.org> >> wrote: >> >>> >>> If you only change that method, doesn't that break the existing >>> clusterShape() method? It appears to be intentionally distinct from >>> activeClusterShape() ? >>> >>> (At a glance, activeClusterShape() is the only one that seems broken >>> based >>> on the way compareActiveReplicaCountsForShards() is defined.) >>> >>> Perhaps just reafactor the current >>> logic in compareActiveReplicaCountsForShards() into >>> clusterShape(), and put the logic you propose directly into >>> activeClusterShape(). >>> >>> Or ... simpler ... >>> >>> Redefine the API of compareActiveReplicaCountsForShards to take in an >>> Iterable<Slice> instead of DocCollection, and then fix >>> activeClusterShape() to pass in collectionState.getActiveSlices() while >>> leaving clusterShape() alone. >>> >>> >>> : Date: Fri, 3 Nov 2023 18:25:52 +0100 >>> : From: Ilan Ginzburg <ilans...@gmail.com> >>> : Reply-To: dev@solr.apache.org >>> : To: dev@solr.apache.org >>> : Subject: Change active replica counting in tests >>> : >>> : Hi, >>> : >>> : I plan to change the method compareActiveReplicaCountsForShards() in >>> : SolrCloudTestCase (code link >>> : < >>> https://github.com/apache/solr/blob/main/solr/test-framework/src/java/org/apache/solr/cloud/SolrCloudTestCase.java#L238 >>> >) >>> : that counts the total number of active replicas for a collection to >>> only >>> : count active replicas *of active shards*. >>> : >>> : I will create a Jira and a PR for the change. >>> : This mail is in case someone thinks it's a bad idea (and says so >>> quickly >>> : please 😅) so I don't invest time for nothing. >>> : >>> : Thanks, >>> : Ilan >>> : >>> >>> -Hoss >>> http://www.lucidworks.com/ >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org >>> For additional commands, e-mail: dev-h...@solr.apache.org >> >>