[
https://issues.apache.org/jira/browse/IGNITE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mikhail Efremov updated IGNITE-22928:
-------------------------------------
Ignite Flags: (was: Docs Required,Release Notes Required)
Labels: ignite-3 (was: )
> Fix testZoneReplicaListener
> ---------------------------
>
> Key: IGNITE-22928
> URL: https://issues.apache.org/jira/browse/IGNITE-22928
> Project: Ignite
> Issue Type: Improvement
> Reporter: Mikhail Efremov
> Assignee: Mikhail Efremov
> Priority: Major
> Labels: ignite-3
>
> *Description*
> The problem with the test is that \{{TestPlacementDriver}} returns only one
> node, which may not be in the replication group, at least at the start of the
> test, and thus has no replica or Raft entities. This leads to an \{{NPE}} in
> the following code from \{{PartitionReplicaLifecycleManager}}:
> {code:title=|language=java|collapse=false}
> return localServicesStartFuture
>         .thenComposeAsync(v -> inBusyLock(busyLock, () -> isLocalNodeIsPrimary(replicaGrpId)), ioExecutor)
>         .thenAcceptAsync(isLeaseholder -> inBusyLock(busyLock, () -> {
>             boolean isLocalNodeInStableOrPending = isNodeInReducedStableOrPendingAssignments(
>                     replicaGrpId,
>                     stableAssignments,
>                     pendingAssignments,
>                     revision
>             );
>
>             if (!isLocalNodeInStableOrPending && !isLeaseholder) {
>                 return;
>             }
>
>             assert isLocalNodeInStableOrPending || isLeaseholder
>                     : "The local node is outside of the replication group [inStableOrPending=" + isLocalNodeInStableOrPending
>                     + ", isLeaseholder=" + isLeaseholder + "].";
>
>             // For forced assignments, we exclude dead stable nodes, and all alive stable nodes are already in pending assignments.
>             // Union is not required in such a case.
>             Set<Assignment> newAssignments = pendingAssignmentsAreForced || stableAssignments == null
>                     ? pendingAssignmentsNodes
>                     : union(pendingAssignmentsNodes, stableAssignments.nodes());
>
>             replicaMgr.replica(replicaGrpId)
>                     .thenApply(Replica::raftClient)
>                     .thenAccept(raftClient -> raftClient.updateConfiguration(fromAssignments(newAssignments)));
>         }), ioExecutor);
> {code}
> The node returned by \{{TestPlacementDriver}} passes the
> \{{isLocalNodeIsPrimary}} check and all subsequent checks in any case. But if
> that node doesn't host the replication group, there is no replica future, so
> \{{replicaMgr#replica}} returns \{{null}} and an \{{NPE}} is thrown on the
> \{{null}} value.
> The solution is to add to \{{TestPlacementDriver}} a mapping from
> \{{ZonePartitionId}} to the \{{ClusterNode}} that hosts the "primary" replica.
> There is another problem, though: in the debugger we can see 25 partitions for
> zone 0. Writing 25 mappings by hand is inconvenient, but zone 0 is the common
> public zone and is the subject of the test. So the solution is either to
> reduce the default zone's partition count or to add mappings for all of its
> partitions.
> *Motivation*
> The crucial test should be fixed.
> *Definition of done*
> The test passes.
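
The mapping proposed in the description could be generated rather than written out by hand. The following is only a minimal sketch under stated assumptions: the `ZonePartitionId` and `ClusterNode` records are simplified stand-ins for the Ignite classes of the same name, `primariesFor` is a hypothetical helper, and the round-robin choice of host is arbitrary. How the real `TestPlacementDriver` would consume such a map is not shown.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PrimaryMappingSketch {
    // Simplified stand-ins for Ignite's ZonePartitionId and ClusterNode (assumption).
    record ZonePartitionId(int zoneId, int partitionId) {}
    record ClusterNode(String name) {}

    /**
     * Hypothetical helper: assigns a "primary" host to every partition of a zone,
     * so that no partition is left without a mapping.
     */
    static Map<ZonePartitionId, ClusterNode> primariesFor(int zoneId, int partitions, List<ClusterNode> hosts) {
        Map<ZonePartitionId, ClusterNode> primaries = new HashMap<>();

        for (int p = 0; p < partitions; p++) {
            // Round-robin is arbitrary; a real test would pick a node
            // that actually hosts a replica of the partition.
            primaries.put(new ZonePartitionId(zoneId, p), hosts.get(p % hosts.size()));
        }

        return primaries;
    }

    public static void main(String[] args) {
        List<ClusterNode> hosts = List.of(new ClusterNode("node-0"), new ClusterNode("node-1"));

        // Zone 0 with 25 partitions, as observed in the description.
        Map<ZonePartitionId, ClusterNode> primaries = primariesFor(0, 25, hosts);

        System.out.println(primaries.get(new ZonePartitionId(0, 3)));
    }
}
```

A lookup by `ZonePartitionId` then returns a node for every partition of the zone, which avoids the case where a single fixed node is reported as primary for a group it does not host.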