[
https://issues.apache.org/jira/browse/PHOENIX-7920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Himanshu Gwalani updated PHOENIX-7920:
--------------------------------------
Description:
The primary-side HA fix (Ritesh Garg) enables a direct ANISTS → AISTS
transition. On the standby this appears as a local HAGroupState transition
from DEGRADED_STANDBY directly to STANDBY_TO_ACTIVE, skipping STANDBY. The
replay side has no listener for this path today: replicationReplayState stays
at DEGRADED, no rewind to lastRoundInSync happens, and shouldTriggerFailover()
(line 493) blocks promotion indefinitely because it requires state == SYNC.
File:
phoenix-core-server/src/main/java/org/apache/phoenix/replication/reader/ReplicationLogDiscoveryReplay.java
*Fix on three fronts (all must land together):*
1. triggerFailoverListener (148-160): add
replicationReplayState.compareAndSet(DEGRADED, SYNCED_RECOVERY) before
failoverPending.set(true). The CAS is conditional so that the happy STANDBY →
STANDBY_TO_ACTIVE path does not pay for a redundant rewind;
failoverPending.set(true) runs unconditionally so the signal is never lost.
2. initializeLastRoundProcessed() (215-263): add a parallel branch for
STANDBY_TO_ACTIVE so that a reader restarting in this state, when
lastSyncStateTimeInMs indicates a prior DEGRADED period, initializes
lastRoundInSync from lastSyncStateTimeInMs and sets the state to
SYNCED_RECOVERY. Without this, a restart after the direct transition silently
skips the files between the pre-crash sync point and the crash, and promotion
proceeds with a hole in the replayed data.
3. Declare lastRoundProcessed and lastRoundInSync volatile to close a
visibility gap between the ZK watcher thread and the scheduler thread that the
new path makes more reachable.
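The three replay-side changes above can be sketched roughly as follows. This
is a minimal illustration, not the actual ReplicationLogDiscoveryReplay code:
the class name ReplayFailoverSketch, the enum ReplayState, the method names
onStandbyToActive and initializeOnRestart, and the priorDegraded parameter are
all hypothetical. How lastSyncStateTimeInMs signals a prior DEGRADED period is
not stated in this ticket, so the sketch takes that decision as an input.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the replay state; only the values named in the ticket.
enum ReplayState { SYNC, DEGRADED, SYNCED_RECOVERY }

// Illustrative sketch only; field names follow the ticket, everything else is assumed.
class ReplayFailoverSketch {
    final AtomicReference<ReplayState> replicationReplayState;
    final AtomicBoolean failoverPending = new AtomicBoolean(false);

    // Fix 3: volatile so a write from the ZK watcher thread is visible
    // to the scheduler thread without extra synchronization.
    volatile long lastRoundProcessed;
    volatile long lastRoundInSync;

    ReplayFailoverSketch(ReplayState initialState) {
        this.replicationReplayState = new AtomicReference<>(initialState);
    }

    // Fix 1: called when the local HA group state becomes STANDBY_TO_ACTIVE.
    void onStandbyToActive() {
        // Succeeds only from DEGRADED; from SYNC the CAS fails and the state is untouched,
        // so the happy path does not trigger a rewind.
        replicationReplayState.compareAndSet(ReplayState.DEGRADED, ReplayState.SYNCED_RECOVERY);
        // Runs regardless of the CAS result so the failover request is never dropped.
        failoverPending.set(true);
    }

    // Fix 2: reader restarts while the HA group is already in STANDBY_TO_ACTIVE.
    // priorDegraded is an assumed input standing in for the real check on lastSyncStateTimeInMs.
    void initializeOnRestart(long lastSyncStateTimeInMs, boolean priorDegraded) {
        if (priorDegraded) {
            // Rewind point: replay resumes from the last known in-sync time.
            lastRoundInSync = lastSyncStateTimeInMs;
            replicationReplayState.set(ReplayState.SYNCED_RECOVERY);
        }
    }
}
```

In this sketch the order inside onStandbyToActive matters: the state change is
attempted before failoverPending is set, so a scheduler thread that observes
the pending flag also observes the updated state.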
*Dependencies:* Coordinate landing with Ritesh's primary-side change widening
HAGroupStoreRecord.HAGroupState.DEGRADED_STANDBY.allowedTransitions (currently
{STANDBY}) and the writer signaling for ANISTS → AISTS. Neither side ships the
new transition without the other.
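As a rough illustration of the dependency, the widening of allowedTransitions
could look like the sketch below. HaStateSketch is a hypothetical,
trimmed-down model containing only the transitions mentioned in this ticket;
the real HAGroupStoreRecord.HAGroupState enum has more states and may be
structured differently. The directPathEnabled flag is an assumption used only
to show the before and after sets side by side.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model of the standby-side states named in the ticket.
enum HaStateSketch {
    STANDBY, DEGRADED_STANDBY, STANDBY_TO_ACTIVE;

    // Returns the allowed target states for this state.
    // directPathEnabled = false models today's behavior, true models the widened set.
    Set<HaStateSketch> allowedTransitions(boolean directPathEnabled) {
        switch (this) {
            case DEGRADED_STANDBY:
                return directPathEnabled
                    ? EnumSet.of(STANDBY, STANDBY_TO_ACTIVE) // widened by the primary-side change
                    : EnumSet.of(STANDBY);                   // current set per the ticket
            case STANDBY:
                return EnumSet.of(STANDBY_TO_ACTIVE);        // the existing happy path
            default:
                // Other transitions are outside the scope of this sketch.
                return EnumSet.noneOf(HaStateSketch.class);
        }
    }

    boolean canTransitionTo(HaStateSketch target, boolean directPathEnabled) {
        return allowedTransitions(directPathEnabled).contains(target);
    }
}
```

A check such as
HaStateSketch.DEGRADED_STANDBY.canTransitionTo(HaStateSketch.STANDBY_TO_ACTIVE, true)
succeeds only with the widened set, which is why the replay-side listener and
the primary-side change need to ship together.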
*Tests:* two listener unit cases (CAS fires from DEGRADED, no-ops from SYNC),
a full IT for the direct path with files in OUT, a restart IT for a crash
mid-transition, an end-to-end cycle IT including the ABORT_TO_STANDBY retry,
and an update to HAGroupStoreRecordTest.testHAGroupStateValidTransitions.
was:
The replay service can get stuck in an infinite loop if there is a persistent
issue while processing older files in the in-progress directory.
{code:java}
files = replicationLogTracker.getOlderInProgressFiles(oldestTimestampToProcess);
while (!files.isEmpty()) {
    processOneRandomFile(files);
    files = replicationLogTracker.getOlderInProgressFiles(oldestTimestampToProcess);
}
{code}
> Add replay-side handling for direct DEGRADED_STANDBY → STANDBY_TO_ACTIVE
> transition (ANISTS → AISTS)
> ----------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-7920
> URL: https://issues.apache.org/jira/browse/PHOENIX-7920
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: Himanshu Gwalani
> Assignee: Himanshu Gwalani
> Priority: Major