osaf/services/saf/ntfsv/README.HYDRA | 110 +++++++++++++++++++++++++++++++++++ 1 files changed, 110 insertions(+), 0 deletions(-)
Add description regarding general solution and API implementation for cloud resilience support in NTF diff --git a/osaf/services/saf/ntfsv/README.HYDRA b/osaf/services/saf/ntfsv/README.HYDRA new file mode 100644 --- /dev/null +++ b/osaf/services/saf/ntfsv/README.HYDRA @@ -0,0 +1,110 @@ +SCs outage support in NTF +=============================== +https://sourceforge.net/p/opensaf/tickets/1180/ + +General +------- + +As support of cloud deployment use case which both SCs possibly are brought down, +the NTF service does not have to provide its full functionality during SCs outage +but it must responsively maintain the interface toward NTF client. It's aimed to +make the NTF client not being aware of SCs outage, some APIs provided to client +are just temporarily unavailable. Finally, NTF functionality as well as all NTF +APIs can resume operationally after one of SCs comes up. This requires the NTF +client incorporates to implement retry mechanism, which has already been +documented in NTF Programmer Guide (4.5). + +Solution +-------- + +The proposed solution must have the following implementation: +1. NTF Agent must return SA_AIS_ERR_TRY_AGAIN during SCs outage in most of APIs +required to communicate with NTF server. + +2. Once SC comes up (also known as NTF server is started), NTF Agent will silently +register to NTF server all information had been known between NTF client and NTF +server, which are: + 2.1 Register client id for all producers, subscribers, readers + NTF Agent sends NTFSV_INITIALIZE_REQ to obtain the new client id. This new + client id will also be used to recover subscriber and reader. + 2.2 Register subscription plus associated filters for all subscribers + The messages NTFSV_SUBSCRIBE_REQ/NTFSV_SUBSCRIBE_RSP can be reused to send + to NTF server. + 2.3 Register reader plus associated filters for all readers + The messages NTFSV_READER_INITIALIZE_REQ_2/NTFSV_READER_INITIALIZE_RSP can + be reused to send to NTF server. + As these registrations are done successfully, the producers can continue sending +notifications, subscribers can receive notification callbacks, readers can read +notifications satisfied filter criteria. + +3. As NTF server is started, it will restore all notifications written in persistent +storage. The storage has not been decided yet, it could rely on IMM PBE, or it +introduces new persistent file managed by NTF server. In anyhow NTF server shall be +modified to load/save notification records. + +Affected APIs and behaviours +---------------------------- + +Once NTF Agent detects the NTF server down completely, all client_hdl are set as +invalid +1. saNtfInitialize + Return SA_AIS_ERR_TRY_AGAIN if NTF server is unavailable (No Active due to + fail-over) or completely down. + +2. saNtfFinalize + If NTF server is unavailable, return SA_AIS_ERR_TRY_AGAIN. + If NTF server is down, silently clean up client handle and return SA_AIS_OK. + +3. saNtfNotificationSend + Return SA_AIS_ERR_TRY_AGAIN if NTF server is unavailable or completely down. + If NTF server is up, register all invalid clients. + If client registration is done, notification can continuously be sent. + +4. saNtfNotificationSubscribe + Return SA_AIS_ERR_TRY_AGAIN if NTF server is unavailable or completely down. + If NTF server is up, register all invalid clients. + If client registration is done, client can successfully subscribe for notification. + +5. saNtfNotificationUnsubscribe + Return SA_AIS_ERR_TRY_AGAIN if NTF server is unavailable. + If NTF server is down, silently clean up subscriber information then return + SA_AIS_OK. + If NTF server is up and client is invalid, silently clean up subscriber, also + register all invalid clients and return SA_AIS_OK if all operations succeed. + +6. saNtfNotificationReadInitialize + Return SA_AIS_ERR_TRY_AGAIN if NTF server is unavailable or completely down. + If NTF server is up, register all invalid clients. + If client registration is done, client can successfully initialize a new reader + +7. saNtfNotificationReadNext + Return SA_AIS_ERR_TRY_AGAIN if NTF server is unavailable or completely down. + If NTF server is up but client handle is invalidated, register all clients and + initialize the reader again. The parameter notification in saNtfNotificationReadNext + can be considered as the last notification successfully been read, from which + NTF Agent can use to find the next notification that the client wishes. However, + this continuous read is not supported since notifications are not preserved + after SC outage. + +8. saNtfNotificationReadFinalize + Return SA_AIS_ERR_TRY_AGAIN if NTF server is unavailable. + If NTF server is down, silently clean up reader information then return SA_AIS_OK + If NTF server is up and client is invalid, silently clean up reader, also register + all invalid clients and return SA_AIS_OK if all operations succeed. + +9. Callbacks + Once NTF Agent detects that the NTF Server is up after period of outage, NTF Agent + silently registers all invalid clients and subscribe for notification if the client + is a subscriber. + +Non-affected APIs and explanations +---------------------------------- +The following APIs do not require communication with NTF server, also the information +these APIs manipulate are cached locally in NTF Agent. Therefore they can be performed +normally despite of NTF server state. + +1. Notification allocation APIs for Producer +2. Filter allocation APIs for Consumer +3. saNtfNotificationFree, saNtfNotificationFilterFree +4. saNtfPtrValGet, saNtfArrayValGet +5. saNtfDispatch, saNtfSelectionObjectGet ------------------------------------------------------------------------------ _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel