Setting an implementer during cluster merge or split-brain recovery
could fail with SA_AIS_ERR_EXIST even though the coordinator had already
accepted the new implementer, when the agent should instead have been
told to retry. The visible symptom is that the IMM agent either times
out or receives a response that later leads to SA_AIS_ERR_EXIST when it
retries.

After a network merge, IMMND re-introduces itself and enters a sync
phase with the coordinator (mIntroduced == 2). While syncing, IMMND may
drop the set-implementer response even though the coordinator has
accepted the request and set the implementer. This creates a mismatch:
the IMM agent either times out or gets a stale or missing response, and
when it retries the operation IMMND reports SA_AIS_ERR_EXIST because
the implementer is already set on the coordinator.

Add an early check for the case where IMMND still needs to sync with
the coordinator, and return SA_AIS_ERR_TRY_AGAIN in that case. This
prevents further processing while IMMND is syncing and makes the
retryable condition explicit so the agent can try again later.
---
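Note for reviewers (not part of the commit message): the sketch below
is a standalone model of the intended behaviour, not OpenSAF code. It
assumes a caller that simply retries on TRY_AGAIN. Every name in it
(demo_err, demo_immnd, demo_impl_set, demo_impl_set_with_retry) is
hypothetical, the enum is a local stand-in for the SA_AIS_* codes, and
"introduced == 1" is used only as this sketch's marker for "sync
finished".

```c
#include <assert.h>

/* Local stand-ins for the SA_AIS_* codes; NOT the real OpenSAF headers. */
typedef enum {
	DEMO_OK = 1,
	DEMO_ERR_TRY_AGAIN = 6,
	DEMO_ERR_EXIST = 14
} demo_err;

/* Hypothetical, heavily reduced model of the IMMND state touched here. */
struct demo_immnd {
	int introduced;       /* stands in for cb->mIntroduced; 2 = syncing */
	int sync_rounds_left; /* calls remaining until the simulated sync ends */
	int impl_set;         /* 1 once the implementer has been set */
};

/* Models the handler with the early check: refuse work while syncing. */
static demo_err demo_impl_set(struct demo_immnd *nd)
{
	if (nd->introduced == 2) {
		/* Each call simulates time passing during the sync phase. */
		if (--nd->sync_rounds_left <= 0)
			nd->introduced = 1; /* assumption: 1 means synced */
		return DEMO_ERR_TRY_AGAIN;
	}
	if (nd->impl_set)
		return DEMO_ERR_EXIST;
	nd->impl_set = 1;
	return DEMO_OK;
}

/* Hypothetical agent-side loop: retry only on TRY_AGAIN, up to a bound. */
static demo_err demo_impl_set_with_retry(struct demo_immnd *nd,
					 int max_attempts, int *attempts_used)
{
	assert(max_attempts > 0);

	demo_err rc = DEMO_ERR_TRY_AGAIN;
	int n = 0;

	while (n < max_attempts) {
		rc = demo_impl_set(nd);
		n++;
		if (rc != DEMO_ERR_TRY_AGAIN)
			break;
		/* A real agent would sleep or back off before retrying. */
	}
	if (attempts_used)
		*attempts_used = n;
	return rc;
}
```

In this model the implementer is never set while the node is syncing,
so a retry after the sync completes succeeds rather than hitting the
EXIST case described in the commit message.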
src/imm/immnd/immnd_evt.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/src/imm/immnd/immnd_evt.c b/src/imm/immnd/immnd_evt.c
index 46cb85b31..088da129e 100644
--- a/src/imm/immnd/immnd_evt.c
+++ b/src/imm/immnd/immnd_evt.c
@@ -2888,6 +2888,13 @@ static uint32_t immnd_evt_proc_impl_set(IMMND_CB *cb, IMMND_EVT *evt,
goto agent_rsp;
}
+ if (cb->mIntroduced == 2) {
+ TRACE_2("ERR_TRY_AGAIN: wait for IMMND sync up with coord");
+ send_evt.info.imma.info.implSetRsp.error =
+ SA_AIS_ERR_TRY_AGAIN;
+ goto agent_rsp;
+ }
+
if (evt->type == IMMND_EVT_A2ND_OI_IMPL_SET_2 &&
!immModel_protocol45Allowed(cb)) {
LOG_WA(
--
2.34.1