Sorry, I posted a wrong unrelated PR link in my previous email.

The PR for this change is https://github.com/apache/solr/pull/2063
"SOLR-17067 <https://issues.apache.org/jira/browse/SOLR-17067>
SolrCloudTestCase activeClusterShape() should only consider active shards"


On Sat, Nov 4, 2023 at 11:28 PM Ilan Ginzburg <ilans...@gmail.com> wrote:

> Created SOLR-17067 <https://issues.apache.org/jira/browse/SOLR-17067> and PR
> 1965 <https://github.com/apache/solr/pull/1965>
>
> On Sat, Nov 4, 2023 at 8:28 PM Ilan Ginzburg <ilans...@gmail.com> wrote:
>
>> Thanks Hoss for your input. I was focused on that specific method (to
>> contribute back a test change we made on our Solr fork) and was missing the
>> big picture. I therefore explored your suggestions.
>>
>> I agree activeClusterShape() is where the change makes the most sense. I
>> can of course limit it to that method.
>>
>> Note though that both clusterShape() and activeClusterShape() already
>> filter out replicas advertised as active but living on non live nodes!
>> A replica is effectively active if its state says so *and* its node is
>> live *and* its shard is active.
>>
>> When compareActiveReplicaCountsForShards() is changed to not count
>> active replicas on non active shards (impacting the two verification
>> methods), the only org.apache.solr.cloud tests that fail are split
>> related and already call activeClusterShape(). Splits end with the
>> parent shard being inactive but its replicas still active.
>> These tests correctly verify the post split number of shards (increasing
>> total count by the number of children, decreasing by one to account for the
>> parent shard), but verify a total number of replicas including the new
>> replicas for the children shards without removing the ones for the parent.
>>
>> ClusterStateUtil.waitForAllActiveAndLiveReplicas() and
>> waitForAllReplicasNotLive() verify replicas for a specific state and do
>> not look at replicas of inactive shards.
>>
>> Ilan
>>
>>
>>
>> On Fri, Nov 3, 2023 at 7:12 PM Chris Hostetter <hossman_luc...@fucit.org>
>> wrote:
>>
>>>
>>> If you only change that method, doesn't that break the existing
>>> clusterShape() method?  It appears to be intentionally distinct from
>>> activeClusterShape() ?
>>>
>>> (At a glance, activeClusterShape() is the only one that seems broken
>>> based
>>> on the way compareActiveReplicaCountsForShards() is defined.)
>>>
>>> Perhaps just reafactor the current
>>> logic in compareActiveReplicaCountsForShards() into
>>> clusterShape(), and put the logic you propose directly into
>>> activeClusterShape().
>>>
>>> Or ... simpler ...
>>>
>>> Redefine the API of compareActiveReplicaCountsForShards to take in an
>>> Iterable<Slice> instead of DocCollection, and then fix
>>> activeClusterShape() to pass in collectionState.getActiveSlices() while
>>> leaving clusterShape() alone.
>>>
>>>
>>> : Date: Fri, 3 Nov 2023 18:25:52 +0100
>>> : From: Ilan Ginzburg <ilans...@gmail.com>
>>> : Reply-To: dev@solr.apache.org
>>> : To: dev@solr.apache.org
>>> : Subject: Change active replica counting in tests
>>> :
>>> : Hi,
>>> :
>>> : I plan to change the method compareActiveReplicaCountsForShards() in
>>> : SolrCloudTestCase (code link
>>> : <
>>> https://github.com/apache/solr/blob/main/solr/test-framework/src/java/org/apache/solr/cloud/SolrCloudTestCase.java#L238
>>> >)
>>> : that counts the total number of active replicas for a collection to
>>> only
>>> : count active replicas *of active shards*.
>>> :
>>> : I will create a Jira and a PR for the change.
>>> : This mail is in case someone thinks it's a bad idea (and says so
>>> quickly
>>> : please 😅) so I don't invest time for nothing.
>>> :
>>> : Thanks,
>>> : Ilan
>>> :
>>>
>>> -Hoss
>>> http://www.lucidworks.com/
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>>> For additional commands, e-mail: dev-h...@solr.apache.org
>>
>>

Reply via email to