Hi Minh,
Ack for the patch, code review only
Regards,
Ravi
-Original Message-
From: Minh Chau [mailto:minh.c...@dektech.com.au]
Sent: Thursday, April 26, 2018 4:52 AM
To: anders.wid...@ericsson.com; hans.nordeb...@ericsson.com;
ravisekhar.ko...@oracle.com
Cc: opensaf-devel@lists.source
Summary: clmd: Increase message priority of CLMSV_CLMS_MDS_NODE_EVT to be sent
to main thread [#2842]
Review request for Ticket(s): 2842
Peer Reviewer(s): *** Anders, Hans, Ravi
Pull request to: *** LIST THE PERSON WITH PUSH ACCESS HERE ***
Affected branch(es): develop
Development branch: ticket-2
When the standby controller is stopped and started, stopping the node
generates the MDS event CLMSV_CLMS_MDS_NODE_EVT. This event is sent to the
main thread with NORMAL priority. When the node is started again, other
events such as CLMSV_CLUSTER_JOIN_REQ are sent with HIGH priority.
The r
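
To illustrate the ordering issue described in the snippet above, here is a minimal sketch of a priority mailbox. It is not OpenSAF's mailbox implementation: the Mailbox class, the HIGH/NORMAL constants and the drain helper are hypothetical names used only for illustration. It shows how a message sent later with HIGH priority can be delivered before an earlier NORMAL one, and how giving both the same priority preserves send order.

```python
import heapq
import itertools

# Hypothetical priority levels for this sketch; lower value is served first.
HIGH, NORMAL = 0, 1


class Mailbox:
    """Toy priority mailbox: highest priority first, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker that preserves send order

    def send(self, priority, msg):
        heapq.heappush(self._heap, (priority, next(self._seq), msg))

    def receive(self):
        return heapq.heappop(self._heap)[2]

    def empty(self):
        return not self._heap


def drain(mailbox):
    """Return all pending messages in the order the main thread would see them."""
    delivered = []
    while not mailbox.empty():
        delivered.append(mailbox.receive())
    return delivered


# Node goes down (NORMAL), then rejoins (HIGH): the join request overtakes.
mixed = Mailbox()
mixed.send(NORMAL, "CLMSV_CLMS_MDS_NODE_EVT")
mixed.send(HIGH, "CLMSV_CLUSTER_JOIN_REQ")
mixed_order = drain(mixed)

# Same two events, both sent with HIGH priority: send order is kept.
same = Mailbox()
same.send(HIGH, "CLMSV_CLMS_MDS_NODE_EVT")
same.send(HIGH, "CLMSV_CLUSTER_JOIN_REQ")
same_order = drain(same)
```

In this sketch, raising the node event to the same priority as the join request is what restores the expected down-then-join ordering.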
Summary: msgd: handle abrupt restart of remote node [#2840]
Review request for Ticket(s): 2840
Peer Reviewer(s): Srinivas
Pull request to:
Affected branch(es): develop
Development branch: ticket-2840
Base revision: dd6a9bfe9d897fe9cc3a70e21d7e066b7a727d44
Personal repository: git://git.code.sf.net
Sometimes when a remote node restarts abruptly, queues that were created on
that node cannot be opened again when the node comes back up.
There is a race condition when the remote node goes down between msgd getting
the CLM and MDS events indicating node down, and immd removing the implemente
Hi Alex,
OK, I'll check whether there is a problem, but immnd is restartable and
should be restarted after the nid phase is finished.
After the nid phase the system should be in a "well defined" state.
That was one of the reasons fifo monitoring was added to the nid phase.
/HansN
On 04/25/20
Hi Hans,
I understand. But, what if it doesn't fail in the nid phase?
If you run this command in your setup: "systemctl start opensafd;
sleep 2; pkill -KILL immnd", does immnd get restarted? And does
opensafd successfully come up according to systemd?
Alex
On 04/25/
Hi Alex,
the reboot should only happen if REBOOT_ON_FAIL_TIMEOUT is set (i.e. not 0).
I checked the latest version, the reboot works fine if e.g. immnd fails in the
nid phase and REBOOT_ON_FAIL_TIMEOUT is set.
/Thanks HansN
From: Alex Jones [mailto:ajo...@rbbn.com]
Sent: den 25 april 2018 15:0
Hi Hans,
There must be a hole here, then. Because in our setup, if dtmd or
immnd crashes early in the startup process, the node doesn't reboot,
and the executables are not restarted. If I set "Restart=on-failure" it
works fine.
Can you test this in your setup to see if y
rded should not automatically include itself in the cluster member list.
Instead it should rely solely on AMFND service up, so that the count
is consistent across nodes.
Also adjust some split-brain prevention related values. More time
is required to ensure we should have an accurate view of clust
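
A minimal sketch of the member-count idea in the snippet above, under stated assumptions: the MemberList class and its on_amfnd_up/on_amfnd_down methods are hypothetical and are not rded's actual code. It assumes every node receives the same AMFND service-up events, and compares a node that adds itself up front with one that relies on those events alone.

```python
class MemberList:
    """Toy cluster member list driven by AMFND service up/down events."""

    def __init__(self, self_id, include_self):
        # Assumption for this sketch: the old behaviour pre-populates the
        # list with the local node; the new behaviour starts empty.
        self.members = {self_id} if include_self else set()

    def on_amfnd_up(self, node_id):
        self.members.add(node_id)

    def on_amfnd_down(self, node_id):
        self.members.discard(node_id)

    def count(self):
        return len(self.members)


def counts_after_events(include_self, up_events):
    """Replay the same service-up events on node 1 and node 2."""
    nodes = [MemberList(1, include_self), MemberList(2, include_self)]
    for node_id in up_events:
        for node in nodes:
            node.on_amfnd_up(node_id)
    return [node.count() for node in nodes]


# Only node 1's AMFND is up so far; node 2's AMFND has not started yet.
with_self = counts_after_events(include_self=True, up_events=[1])
events_only = counts_after_events(include_self=False, up_events=[1])
```

With self-inclusion the two nodes disagree on the member count in this scenario, while counting only service-up events gives both nodes the same view.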
Summary: rded: prevent unnecessary takeover [#2843]
Review request for Ticket(s): 2843
Peer Reviewer(s): Anders, Hans, Ravi
Pull request to: *** LIST THE PERSON WITH PUSH ACCESS HERE ***
Affected branch(es): develop
Development branch: ticket-2843
Base revision: dd6a9bfe9d897fe9cc3a70e21d7e066b7a72