I have a patch implementing the "INVALID HANDLE" idea, which is similar to
#1179.
Comments are welcome.
--- Brief explanation ---
These are the MDS events that the NTF Agent receives in Normal View in the
following cases:
- Start ntfsubscribe
Nov 06 05:32:44 PL-3 ntfsubscribe: NO mds_cb_info->info.svc_evt.i_change 3
(NCSMDS_UP)
- Stop SC-1, failover
Nov 06 05:33:09 PL-3 ntfsubscribe: NO mds_cb_info->info.svc_evt.i_change 1
(NCSMDS_NO_ACTIVE)
Nov 06 05:33:09 PL-3 ntfsubscribe: NO NTFS down
- Active NTF Server is on SC-2
Nov 06 05:33:10 PL-3 ntfsubscribe: NO mds_cb_info->info.svc_evt.i_change 2
(NCSMDS_NEW_ACTIVE)
Nov 06 05:33:10 PL-3 ntfsubscribe: NO MSG from NTFS NCSMDS_NEW_ACTIVE/UP
- Stop SC-2
Nov 06 05:33:41 PL-3 ntfsubscribe: NO mds_cb_info->info.svc_evt.i_change 1
(NCSMDS_NO_ACTIVE)
Nov 06 05:33:41 PL-3 ntfsubscribe: NO NTFS down
- No Active NTF Server
Nov 06 05:33:41 PL-3 ntfsubscribe: NO mds_cb_info->info.svc_evt.i_change 4
(NCSMDS_DOWN)
Nov 06 05:33:41 PL-3 ntfsubscribe: NO NTFS down
- Start SC-1 again, Active NTF Server is on SC-1
Nov 06 05:34:11 PL-3 ntfsubscribe: NO mds_cb_info->info.svc_evt.i_change 2
(NCSMDS_NEW_ACTIVE)
Nov 06 05:34:11 PL-3 ntfsubscribe: NO MSG from NTFS NCSMDS_NEW_ACTIVE/UP
- Restart the cluster, start ntfsubscribe, then stop only SC-2: no MDS event
So @ntfa_ntfsv_state_t is introduced to track the server state based on the
MDS events.
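For reading the log excerpts above, the numeric i_change values can be mapped
back to their event names. This is only a small sketch: mds_change_name() is a
hypothetical helper, not part of the patch, and the numeric values are taken
solely from the log lines in this mail rather than from the MDS headers.

```c
/* Hypothetical helper: maps the i_change numbers seen in the log above
 * to the event names printed next to them. The values come only from
 * those log lines, not from the MDS header files. */
const char *mds_change_name(int i_change)
{
    switch (i_change) {
    case 1:  return "NCSMDS_NO_ACTIVE";
    case 2:  return "NCSMDS_NEW_ACTIVE";
    case 3:  return "NCSMDS_UP";
    case 4:  return "NCSMDS_DOWN";
    default: return "UNKNOWN"; /* any value not seen in the log */
    }
}
```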
State handling:
- The initial value is NTFA_NTFSV_NONE.
- When an NTF client starts and the Agent receives NCSMDS_UP,
  @ntfa_ntfsv_state_t is set to NTFA_NTFSV_UP.
  - In state NTFA_NTFSV_UP, all APIs function normally if the handle is valid.
- If the active SC then goes down, the Agent receives NCSMDS_NO_ACTIVE and
  sets @ntfa_ntfsv_state_t to NTFA_NTFSV_NO_ACTIVE.
  - In state NTFA_NTFSV_NO_ACTIVE, every API call returns TRY_AGAIN.
  - If NCSMDS_NEW_ACTIVE arrives afterwards, @ntfa_ntfsv_state_t is set to
    NTFA_NTFSV_UP.
    - All APIs function normally with a valid handle.
  - Else, if NCSMDS_DOWN arrives, @ntfa_ntfsv_state_t is set to
    NTFA_NTFSV_DOWN.
- In state NTFA_NTFSV_DOWN:
  - saNtfInitialize returns TRY_AGAIN
  - saNtfFinalize returns OK
  - All other APIs return BAD_HANDLE
- In state NTFA_NTFSV_DOWN, if one of the SCs starts again, the Agent receives
  NCSMDS_NEW_ACTIVE.
  - "Recovery" could be done at this point, but it is not implemented yet.
  - So for now @ntfa_ntfsv_state_t is set to NTFA_NTFSV_UP, and all APIs
    function normally if the handle is valid.
  - Any API call with an "old" handle receives BAD_HANDLE. The current
    implementation already does this, because the APIs call ncshm_take_hdl()
    to look up the handle record.
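
The state handling above can be sketched roughly as below. This is not the
code from the attached patch. The enum layout, the MDS_EVT_* names, and the
helpers next_state() and api_precheck() are made up for illustration, and the
return code in state NTFA_NTFSV_NONE is my assumption, since it is not covered
in the description above.

```c
#include <stdbool.h>

/* State names follow the description above; the enum layout is assumed. */
typedef enum {
    NTFA_NTFSV_NONE,
    NTFA_NTFSV_UP,
    NTFA_NTFSV_NO_ACTIVE,
    NTFA_NTFSV_DOWN
} ntfa_ntfsv_state_t;

/* Hypothetical stand-ins for the MDS events; values as seen in the log. */
typedef enum {
    MDS_EVT_NO_ACTIVE = 1,
    MDS_EVT_NEW_ACTIVE = 2,
    MDS_EVT_UP = 3,
    MDS_EVT_DOWN = 4
} mds_evt_t;

typedef enum { API_INITIALIZE, API_FINALIZE, API_OTHER } api_kind_t;
typedef enum { RC_OK, RC_TRY_AGAIN, RC_BAD_HANDLE } rc_t;

/* Simplification: the new state depends only on the received event. */
ntfa_ntfsv_state_t next_state(ntfa_ntfsv_state_t cur, mds_evt_t evt)
{
    switch (evt) {
    case MDS_EVT_UP:         return NTFA_NTFSV_UP;
    case MDS_EVT_NO_ACTIVE:  return NTFA_NTFSV_NO_ACTIVE;
    case MDS_EVT_NEW_ACTIVE: return NTFA_NTFSV_UP; /* no recovery yet */
    case MDS_EVT_DOWN:       return NTFA_NTFSV_DOWN;
    default:                 return cur; /* unknown event: keep state */
    }
}

/* Return code an API gives before doing any real work. RC_OK means
 * "continue with normal processing". */
rc_t api_precheck(ntfa_ntfsv_state_t state, api_kind_t api, bool handle_valid)
{
    switch (state) {
    case NTFA_NTFSV_NO_ACTIVE:
        return RC_TRY_AGAIN;
    case NTFA_NTFSV_DOWN:
        if (api == API_INITIALIZE)
            return RC_TRY_AGAIN;
        if (api == API_FINALIZE)
            return RC_OK;
        return RC_BAD_HANDLE;
    case NTFA_NTFSV_NONE:
        return RC_TRY_AGAIN; /* assumption: not specified above */
    case NTFA_NTFSV_UP:
    default:
        /* saNtfInitialize has no handle yet; the others need a valid one. */
        if (api != API_INITIALIZE && !handle_valid)
            return RC_BAD_HANDLE;
        return RC_OK;
    }
}
```

The handle_valid flag stands in for the result of the ncshm_take_hdl() lookup,
so an "old" handle used after the server is back in NTFA_NTFSV_UP still ends
up as BAD_HANDLE.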
Attachment: ntf_1180_invalhdl.patch.patch (23.5 kB; application/octet-stream)
---
** [tickets:#1180] NTF: Ntf service shall be able to recover if both SC nodes
goes down**
**Status:** unassigned
**Milestone:** 4.6.FC
**Created:** Mon Oct 20, 2014 01:29 PM UTC by elunlen
**Last Updated:** Mon Oct 20, 2014 01:29 PM UTC
**Owner:** elunlen
The NTF service shall be able to recover if both SC nodes go down at the same
time. This is not possible today; a cluster restart is needed.
NOTE: This is also applicable for the LOG service. [#1179]
---