Thanks Hoss for your input. I was focused on that specific method (to
contribute back a test change we made on our Solr fork) and was missing the
big picture. I therefore explored your suggestions.

I agree activeClusterShape() is where the change makes the most sense. I
can of course limit it to that method.

Note though that both clusterShape() and activeClusterShape() already
filter out replicas advertised as active but living on non live nodes!
A replica is effectively active if its state says so *and* its node is live
*and* its shard is active.

When compareActiveReplicaCountsForShards() is changed to not count active
replicas on non active shards (impacting the two verification methods), the
only org.apache.solr.cloud tests that fail are split related and already
call activeClusterShape(). Splits end with the parent shard being inactive
but its replicas still active.
These tests correctly verify the post split number of shards (increasing
total count by the number of children, decreasing by one to account for the
parent shard), but verify a total number of replicas including the new
replicas for the children shards without removing the ones for the parent.

ClusterStateUtil.waitForAllActiveAndLiveReplicas() and
waitForAllReplicasNotLive() verify replicas for a specific state and do not
look at replicas of inactive shards.

Ilan



On Fri, Nov 3, 2023 at 7:12 PM Chris Hostetter <hossman_luc...@fucit.org>
wrote:

>
> If you only change that method, doesn't that break the existing
> clusterShape() method?  It appears to be intentionally distinct from
> activeClusterShape() ?
>
> (At a glance, activeClusterShape() is the only one that seems broken based
> on the way compareActiveReplicaCountsForShards() is defined.)
>
> Perhaps just reafactor the current
> logic in compareActiveReplicaCountsForShards() into
> clusterShape(), and put the logic you propose directly into
> activeClusterShape().
>
> Or ... simpler ...
>
> Redefine the API of compareActiveReplicaCountsForShards to take in an
> Iterable<Slice> instead of DocCollection, and then fix
> activeClusterShape() to pass in collectionState.getActiveSlices() while
> leaving clusterShape() alone.
>
>
> : Date: Fri, 3 Nov 2023 18:25:52 +0100
> : From: Ilan Ginzburg <ilans...@gmail.com>
> : Reply-To: dev@solr.apache.org
> : To: dev@solr.apache.org
> : Subject: Change active replica counting in tests
> :
> : Hi,
> :
> : I plan to change the method compareActiveReplicaCountsForShards() in
> : SolrCloudTestCase (code link
> : <
> https://github.com/apache/solr/blob/main/solr/test-framework/src/java/org/apache/solr/cloud/SolrCloudTestCase.java#L238
> >)
> : that counts the total number of active replicas for a collection to only
> : count active replicas *of active shards*.
> :
> : I will create a Jira and a PR for the change.
> : This mail is in case someone thinks it's a bad idea (and says so quickly
> : please 😅) so I don't invest time for nothing.
> :
> : Thanks,
> : Ilan
> :
>
> -Hoss
> http://www.lucidworks.com/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> For additional commands, e-mail: dev-h...@solr.apache.org

Reply via email to