[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize
- **status**: review --> fixed
- **assigned_to**: Hans Feldt --> nobody
- **Comment**:

changeset: 5845:e7c037863a2e
branch: opensaf-4.5.x
parent: 5843:633a2e496589
user: Hans Feldt hans.fe...@ericsson.com
date: Fri Sep 19 12:40:26 2014 +0200
summary: mds: change process_info key to include svc_id [#1050]

changeset: 5846:7f9c430348da
branch: opensaf-4.5.x
user: Hans Feldt hans.fe...@ericsson.com
date: Fri Sep 19 12:40:36 2014 +0200
summary: mds: delete proc_info for non existing process after tmo [#1050]

changeset: 5847:cee0964e0ed8
branch: opensaf-4.5.x
user: Hans Feldt osafde...@gmail.com
date: Wed Sep 10 11:15:48 2014 +0200
summary: mds: add mds_auth_server_disconnect() [#1050]

changeset: 5848:5a5614999c4a
branch: opensaf-4.5.x
user: Hans Feldt osafde...@gmail.com
date: Wed Sep 10 11:15:49 2014 +0200
summary: imma: use mds_auth_server_disconnect [#1050]

changeset: 5849:26210f148ba2
parent: 5844:0631071a7053
user: Hans Feldt hans.fe...@ericsson.com
date: Fri Sep 19 12:40:26 2014 +0200
summary: mds: change process_info key to include svc_id [#1050]

changeset: 5850:1c4f1af50184
user: Hans Feldt hans.fe...@ericsson.com
date: Fri Sep 19 12:40:36 2014 +0200
summary: mds: delete proc_info for non existing process after tmo [#1050]

changeset: 5851:559e82c21e26
user: Hans Feldt osafde...@gmail.com
date: Wed Sep 10 11:15:48 2014 +0200
summary: mds: add mds_auth_server_disconnect() [#1050]

changeset: 5852:7dd084530461
tag: tip
user: Hans Feldt osafde...@gmail.com
date: Wed Sep 10 11:15:49 2014 +0200
summary: imma: use mds_auth_server_disconnect [#1050]

---

**[tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize**

**Status:** fixed
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Wed Sep 17, 2014 02:03 PM UTC
**Owner:** nobody

With MDS/TIPC, amfnd randomly fails to start, causing a failed opensaf start. osafimmnd logs the infamous "immnd_evt_proc_imm_init: ... MDS problem?" warning.

The reason is a random timing variation of the TIPC topology DOWN event. This sometimes causes the DOWN event to wrongly delete a newly added process_info entry. The trigger for this problem is that some IMM clients in opensaf, such as amfnd, do not reuse IMM handles but initialize/finalize in a far from optimal way. This should also be fixed.

The solution under test consists of two parts:
1) The MDS DOWN event just starts a timer in MDS; when the timeout event happens, the process_info entry is deleted.
2) A new explicit disconnect() is added to the MDS API, which the IMMA library uses when it is about to close down the whole core library.

---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.

Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
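The two-part solution in the ticket description can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the MDS implementation: the class and method names, the `(adest, svc_id)` key shape (suggested only by the changeset summary "change process_info key to include svc_id"), the injected `pid_alive` check (suggested only by "delete proc_info for non existing process after tmo"), and the use of a caller-supplied clock are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[int, int]  # (adest, svc_id): assumed key shape, not taken from the MDS source


@dataclass
class ProcessInfo:
    pid: int = 0                       # 0 until the client has authenticated
    delete_at: Optional[float] = None  # deadline of the DOWN timer, if one is armed


class ProcessInfoTable:
    """Toy model of a process_info table with deferred deletion (hypothetical API)."""

    def __init__(self, down_timeout: float, pid_alive: Callable[[int], bool]):
        self.down_timeout = down_timeout
        self.pid_alive = pid_alive
        self.entries: Dict[Key, ProcessInfo] = {}

    def svc_up(self, adest: int, svc_id: int) -> None:
        # Add an entry only if none exists; an existing entry keeps its pid.
        self.entries.setdefault((adest, svc_id), ProcessInfo())

    def authenticate(self, adest: int, svc_id: int, pid: int) -> None:
        self.entries.setdefault((adest, svc_id), ProcessInfo()).pid = pid

    def svc_down(self, adest: int, svc_id: int, now: float) -> None:
        # Part 1: DOWN only arms a timer instead of deleting right away.
        info = self.entries.get((adest, svc_id))
        if info is not None:
            info.delete_at = now + self.down_timeout

    def expire(self, now: float) -> None:
        # On timeout, drop the entry only if its process is gone (assumption).
        for key, info in list(self.entries.items()):
            if info.delete_at is not None and info.delete_at <= now:
                if self.pid_alive(info.pid):
                    info.delete_at = None
                else:
                    del self.entries[key]

    def disconnect(self, adest: int, svc_id: int) -> None:
        # Part 2: an explicit disconnect removes the entry immediately.
        self.entries.pop((adest, svc_id), None)

    def lookup_pid(self, adest: int, svc_id: int) -> Optional[int]:
        info = self.entries.get((adest, svc_id))
        return None if info is None else info.pid
```

With this shape, a late DOWN that arrives after a client has re-initialized no longer wipes the authenticated pid: the entry survives until the timer fires, and it is removed early only when the client disconnects explicitly.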
Re: [tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize
Did you not consider using a/the security-key exchanged between the client and server, as the 'key' to lookup/store from MDS?

Mathi.
Re: [tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize
No, I used the (to me) standard mechanism available for local connected sockets that I was aware of. It is used in other similar situations.

/Hans

-----Original Message-----
From: Mathivanan Naickan Palanivelu [mailto:mathi.naic...@oracle.com]
Sent: 18 September 2014 16:55
To: ramesh.bet...@oracle.com
Cc: opensaf-tickets@lists.sourceforge.net
Subject: Re: [tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

Did you not consider using a/the security-key exchanged between the client and server, as the 'key' to lookup/store from MDS?

Mathi.
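The reply above does not name the mechanism it refers to. One standard Linux facility for connected local (AF_UNIX) sockets is the `SO_PEERCRED` socket option, which lets a server ask the kernel for the peer's pid, uid and gid. The sketch below only illustrates that facility, on the assumption that it is the kind of mechanism meant; `peer_credentials` is a hypothetical helper name and the code is Linux-only.

```python
import os
import socket
import struct
from typing import Tuple


def peer_credentials(sock: socket.socket) -> Tuple[int, int, int]:
    """Return (pid, uid, gid) of the peer of a connected AF_UNIX socket (Linux only)."""
    # struct ucred is three native ints: pid, uid, gid.
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize("3i"))
    pid, uid, gid = struct.unpack("3i", raw)
    return pid, uid, gid


def demo() -> Tuple[int, int, int]:
    # A socketpair is already connected; both ends belong to this process,
    # so the kernel reports this process's own credentials.
    server_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        return peer_credentials(server_end)
    finally:
        server_end.close()
        client_end.close()
```

Because the kernel fills in the credentials, the client cannot forge its pid, which is why a pid obtained this way can serve as the authenticated identity stored next to a destination address.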
[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize
- **Comment**:

Here's a trace snippet from an opensaf start that is hard to explain...

Sep 8 13:47:55.90 osafimmnd [5233:mds_c_api.c:1614] TR svc UP process_info NOTEXIST, svc:26, adest:2020f53b80025
Sep 8 13:47:55.777801 osafimmnd [5233:mds_c_db.c:2352] mds_process_info_add: dest:2020f53b80025, pid:0
Sep 8 13:47:55.777987 osafimmnd [5233:mds_main.c:0151] TR mds: received 77 from 2020f53b80025, pid 5335
Sep 8 13:47:55.778006 osafimmnd [5233:mds_main.c:0167] TR dest 2020f53b80025 already exist
Sep 8 13:47:55.792541 osafimmnd [5233:mds_c_api.c:2675] TR svc 26 DOWN cnt:0, adest:2020f53b80025
Sep 8 13:47:55.792557 osafimmnd [5233:mds_c_db.c:2361] mds_process_info_del: dest:2020f53b80025, pid:5335
Sep 8 13:47:55.792655 osafimmnd [5233:mds_c_api.c:1614] TR svc UP process_info NOTEXIST, svc:26, adest:2020f53b80025
Sep 8 13:47:55.792679 osafimmnd [5233:mds_c_db.c:2352] mds_process_info_add: dest:2020f53b80025, pid:0
Sep 8 13:47:55.792701 osafimmnd [5233:mds_main.c:0151] TR mds: received 77 from 2020f53b80025, pid 5335
Sep 8 13:47:55.792945 osafimmnd [5233:mds_main.c:0167] TR dest 2020f53b80025 already exist
Sep 8 13:47:55.811859 osafimmnd [5233:mds_main.c:0151] TR mds: received 77 from 2020f53b80025, pid 5335
Sep 8 13:47:55.811903 osafimmnd [5233:mds_main.c:0167] TR dest 2020f53b80025 already exist
Sep 8 13:47:55.811994 osafimmnd [5233:mds_c_api.c:2675] TR svc 26 DOWN cnt:0, adest:2020f53b80025
Sep 8 13:47:55.812008 osafimmnd [5233:mds_c_db.c:2361] mds_process_info_del: dest:2020f53b80025, pid:5335
Sep 8 13:47:55.812091 osafimmnd [5233:mds_c_api.c:1614] TR svc UP process_info NOTEXIST, svc:26, adest:2020f53b80025
Sep 8 13:47:55.812104 osafimmnd [5233:mds_c_db.c:2352] mds_process_info_add: dest:2020f53b80025, pid:0
Sep 8 13:47:55.812194 osafimmnd [5233:immnd_evt.c:0726] WA immnd_evt_proc_imm_init: PID 0 (5335) for 2020f53b80025, MDS problem?
Sep 8 13:47:55.812742 osafimmnd [5233:mds_c_api.c:2675] TR svc 26 DOWN cnt:0, adest:2020f53b80025
Sep 8 13:47:55.812760 osafimmnd [5233:mds_c_db.c:2361] mds_process_info_del: dest:2020f53b80025, pid:0

pid:5335 is amfnd

---

**[tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize**

**Status:** review
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Mon Sep 15, 2014 01:45 PM UTC
**Owner:** Hans Feldt
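The event order in the trace can be replayed against a toy table to see why the IMM init ends up with PID 0. This is only a reading of the log lines, not MDS code: the event names, the `replay` function, and the assumption that an authentication message ("received 77 ... pid 5335") overwrites the pid of an existing entry are all hypothetical.

```python
from typing import Dict, List, Tuple

ADEST = 0x2020F53B80025  # the single destination that appears in the trace

# (kind, pid): pid is only meaningful for "auth" events.
TRACE_EVENTS: List[Tuple[str, int]] = [
    ("up", 0), ("auth", 5335), ("down", 0),
    ("up", 0), ("auth", 5335), ("auth", 5335), ("down", 0),
    ("up", 0), ("init", 0), ("down", 0),
]


def replay(events: List[Tuple[str, int]]) -> List[int]:
    """Replay UP/auth/DOWN/init events with immediate delete on DOWN.

    Returns the pid that each "init" event finds in the table (-1 if no entry).
    """
    table: Dict[int, int] = {}  # adest -> pid
    seen_by_init: List[int] = []
    for kind, pid in events:
        if kind == "up":
            table.setdefault(ADEST, 0)   # new entries start with pid 0
        elif kind == "auth":
            table[ADEST] = pid           # "dest ... already exist": pid is filled in
        elif kind == "down":
            table.pop(ADEST, None)       # old behaviour: delete immediately
        elif kind == "init":
            seen_by_init.append(table.get(ADEST, -1))
    return seen_by_init
```

In this model the last DOWN before the init removes the authenticated entry, the following UP re-creates it with pid 0, and no further authentication message arrives before the init is processed, which matches the "PID 0 (5335)" warning in the log.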
Re: [tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize
Hi Hans,

Thanks for providing the traces. They gave more clarity about the race condition between authentication and the TIPC sockets.

I ack the latest patch with one comment:

There can be a memory leak if the client process exits after the expiry of the MDS DOWN_TMR (and without calling mds_auth_server_disconnect()). So a simple function in mds_register_callback() that checks for stale process_info structs (i.e., the PID no longer exists) and deletes them may help.

This latest patch stabilizes the authentication feature, and I sincerely appreciate your listening to my review comments.

Best Regards,
Ramesh.
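The stale-entry check suggested in the review comment can be sketched with the usual POSIX idiom of sending signal 0 to a pid. This is an illustration only: `pid_exists`, `purge_stale` and the plain dict standing in for the process_info table are hypothetical names, and entries with pid 0 (not yet authenticated) are deliberately left alone because `kill(0, 0)` addresses the caller's own process group rather than a single process.

```python
import os
from typing import Dict, Hashable, List


def pid_exists(pid: int) -> bool:
    """Return True if a process with this pid currently exists."""
    if pid <= 0:
        # 0 and negative values have special meanings for kill(); not a real pid.
        return False
    try:
        os.kill(pid, 0)          # signal 0: existence/permission check only
    except ProcessLookupError:   # ESRCH: no such process
        return False
    except PermissionError:      # EPERM: process exists but belongs to someone else
        return True
    return True


def purge_stale(table: Dict[Hashable, int]) -> List[Hashable]:
    """Delete entries whose pid no longer exists; return the removed keys."""
    stale = [key for key, pid in table.items() if pid > 0 and not pid_exists(pid)]
    for key in stale:
        del table[key]
    return stale
```

A pid can be reused by an unrelated process after the original one exits, so a check like this can only reduce leaked entries; it cannot prove that a surviving entry still belongs to the original client.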
[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize
Please check/test these patches.

Attachment: osaf-1050.tgz (5.5 kB; application/x-compressed-tar)

---

**[tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize**

**Status:** review
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Fri Sep 12, 2014 09:26 AM UTC
**Owner:** Hans Feldt
[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize
changeset: 5766:00a8950e6888
branch: opensaf-4.5.x
parent: 5764:ed452ef6f6d4
user: Hans Feldt hans.fe...@ericsson.com
date: Thu Sep 11 15:42:47 2014 +0200
summary: imma: fix potential race with is_immnd_up [#1050]

changeset: 5767:e7bad9ce4537
tag: tip
parent: 5765:d09a52b10727
user: Hans Feldt hans.fe...@ericsson.com
date: Thu Sep 11 15:42:47 2014 +0200
summary: imma: fix potential race with is_immnd_up [#1050]

---

**[tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize**

**Status:** review
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Wed Sep 10, 2014 09:16 AM UTC
**Owner:** Hans Feldt
[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize
- **status**: accepted --> review

---

**[tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize**

**Status:** review
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Tue Sep 09, 2014 07:08 AM UTC
**Owner:** Hans Feldt
[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize
---

**[tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize**

**Status:** accepted
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Tue Sep 09, 2014 07:08 AM UTC
**Owner:** Hans Feldt