Hans,

We have finished verifying the patch series attached to ticket #1050, and the results are good. The mail below has my review comment; please consider this an official Ack on these patches.

Thanks and Regards,
Ramesh.

On 9/18/2014 11:03 AM, ramesh betham wrote:
Hi Hans,

Thanks for providing the traces. They gave more clarity about the race condition between authentication and the TIPC sockets.

I Ack the latest patch with one comment:

    There can be a memleak if the client process exits after the
    expiry of MDS DOWN_TMR (and without calling
    mds_auth_server_disconnect()). So a simple function in
    mds_register_callback() that checks for stale process_info structs
    (i.e., ones whose PID no longer exists) and deletes them may help.
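
A minimal sketch of such a stale-entry check, for illustration only. The struct proc_info type, the linked list and the helper names (pid_is_gone, sweep_stale, add_entry) are hypothetical and are not the actual MDS data structures; only kill(pid, 0) with ESRCH is standard POSIX. Entries with pid 0 are skipped on purpose, since the traces show entries being added with pid:0 before the PID is known, and kill(0, 0) would address the caller's process group.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-in for the MDS process_info entry. */
struct proc_info {
    uint64_t adest;
    pid_t pid;
    struct proc_info *next;
};

/* Returns nonzero if no process with this PID exists. With signal 0,
 * kill() sends nothing and only performs the existence and permission
 * checks; ESRCH means "no such process". */
static int pid_is_gone(pid_t pid)
{
    return kill(pid, 0) == -1 && errno == ESRCH;
}

/* Unlinks and frees every entry whose PID no longer exists.
 * Entries with pid <= 0 (PID not yet known) are left alone.
 * Returns the number of entries removed. */
static int sweep_stale(struct proc_info **head)
{
    int removed = 0;

    assert(head != NULL);
    while (*head != NULL) {
        struct proc_info *e = *head;

        if (e->pid > 0 && pid_is_gone(e->pid)) {
            *head = e->next;   /* unlink, then release */
            free(e);
            removed++;
        } else {
            head = &e->next;   /* keep, move on */
        }
    }
    return removed;
}

/* Pushes a new entry at the front of the list and returns the new head. */
static struct proc_info *add_entry(struct proc_info *head,
                                   uint64_t adest, pid_t pid)
{
    struct proc_info *e = malloc(sizeof(*e));

    assert(e != NULL);
    e->adest = adest;
    e->pid = pid;
    e->next = head;
    return e;
}
```

Note that a PID can be reused by an unrelated process, so a check like this can only catch entries whose PID is currently unused; it is a cleanup aid, not a guarantee.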

This latest patch stabilizes the authentication feature, and I sincerely appreciate your listening to my review comments.

Best Regards,
Ramesh.

On 9/17/2014 7:33 PM, Hans Feldt wrote:

  * *Comment*:

Here's a trace snippet from an opensaf start that is hard to explain...

Sep 8 13:47:55.777790 osafimmnd [5233:mds_c_api.c:1614] TR svc UP process_info NOTEXIST, svc:26, adest:2020f53b80025
Sep 8 13:47:55.777801 osafimmnd [5233:mds_c_db.c:2352] >> mds_process_info_add: dest:2020f53b80025, pid:0
Sep 8 13:47:55.777987 osafimmnd [5233:mds_main.c:0151] TR mds: received 77 from 2020f53b80025, pid 5335
Sep 8 13:47:55.778006 osafimmnd [5233:mds_main.c:0167] TR dest 2020f53b80025 already exist
Sep 8 13:47:55.792541 osafimmnd [5233:mds_c_api.c:2675] TR svc 26 DOWN cnt:0, adest:2020f53b80025
Sep 8 13:47:55.792557 osafimmnd [5233:mds_c_db.c:2361] >> mds_process_info_del: dest:2020f53b80025, pid:5335

Sep 8 13:47:55.792655 osafimmnd [5233:mds_c_api.c:1614] TR svc UP process_info NOTEXIST, svc:26, adest:2020f53b80025
Sep 8 13:47:55.792679 osafimmnd [5233:mds_c_db.c:2352] >> mds_process_info_add: dest:2020f53b80025, pid:0
Sep 8 13:47:55.792701 osafimmnd [5233:mds_main.c:0151] TR mds: received 77 from 2020f53b80025, pid 5335
Sep 8 13:47:55.792945 osafimmnd [5233:mds_main.c:0167] TR dest 2020f53b80025 already exist
Sep 8 13:47:55.811859 osafimmnd [5233:mds_main.c:0151] TR mds: received 77 from 2020f53b80025, pid 5335
Sep 8 13:47:55.811903 osafimmnd [5233:mds_main.c:0167] TR dest 2020f53b80025 already exist
Sep 8 13:47:55.811994 osafimmnd [5233:mds_c_api.c:2675] TR svc 26 DOWN cnt:0, adest:2020f53b80025
Sep 8 13:47:55.812008 osafimmnd [5233:mds_c_db.c:2361] >> mds_process_info_del: dest:2020f53b80025, pid:5335
Sep 8 13:47:55.812091 osafimmnd [5233:mds_c_api.c:1614] TR svc UP process_info NOTEXIST, svc:26, adest:2020f53b80025
Sep 8 13:47:55.812104 osafimmnd [5233:mds_c_db.c:2352] >> mds_process_info_add: dest:2020f53b80025, pid:0

Sep 8 13:47:55.812194 osafimmnd [5233:immnd_evt.c:0726] WA immnd_evt_proc_imm_init: PID 0 (5335) for 2020f53b80025, MDS problem?
Sep 8 13:47:55.812742 osafimmnd [5233:mds_c_api.c:2675] TR svc 26 DOWN cnt:0, adest:2020f53b80025
Sep 8 13:47:55.812760 osafimmnd [5233:mds_c_db.c:2361] >> mds_process_info_del: dest:2020f53b80025, pid:0

pid:5335 is amfnd

------------------------------------------------------------------------

*[tickets:#1050] <http://sourceforge.net/p/opensaf/tickets/1050> amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize*

*Status:* review
*Milestone:* 4.5.0
*Created:* Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
*Last Updated:* Mon Sep 15, 2014 01:45 PM UTC
*Owner:* Hans Feldt

With MDS/TIPC, amfnd randomly fails to start, causing a failed opensaf start.

osafimmnd logs the infamous "immnd_evt_proc_imm_init: ... MDS problem?"

The reason is a random timing variation of the TIPC topology DOWN event. This sometimes causes the DOWN event to wrongly delete a newly added process_info entry.

The trigger for this problem is that some IMM clients in opensaf, like amfnd, do not reuse IMM handles but initialize/finalize in a far from optimal way. This should also be fixed.

The solution under test consists of two parts:
1) The MDS down event just starts a timer in MDS; when the timeout event happens, the process_info entry is deleted.

2) A new explicit disconnect() is added to the MDS API which is used by IMMA library when it is about to close down the whole core library.
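
The two parts can be sketched roughly as follows. This is a simplified model, not the patch itself: the table, the function names (on_up, on_down, expire_entries, explicit_disconnect), the DOWN_TMR_MS value and the logical millisecond clock are all hypothetical. It also assumes that an UP event arriving before the timer fires cancels the pending delete, which the description above does not state explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 8
#define DOWN_TMR_MS 10000u  /* hypothetical timeout value */

/* Hypothetical, simplified process_info entry. */
struct entry {
    bool in_use;
    uint64_t adest;
    int pid;
    bool down_pending;       /* DOWN seen, delete is scheduled */
    uint64_t expires_at_ms;  /* valid only while down_pending */
};

static struct entry table[MAX_ENTRIES];

static struct entry *find_entry(uint64_t adest)
{
    for (size_t i = 0; i < MAX_ENTRIES; i++)
        if (table[i].in_use && table[i].adest == adest)
            return &table[i];
    return NULL;
}

/* UP: reuse an existing entry and cancel any pending delete
 * (assumption), otherwise take a free slot. */
static struct entry *on_up(uint64_t adest, int pid)
{
    struct entry *e = find_entry(adest);

    for (size_t i = 0; e == NULL && i < MAX_ENTRIES; i++)
        if (!table[i].in_use)
            e = &table[i];
    assert(e != NULL);  /* a full table is not handled in this sketch */
    e->in_use = true;
    e->adest = adest;
    e->pid = pid;
    e->down_pending = false;
    return e;
}

/* Part 1: DOWN only schedules the delete instead of doing it. */
static void on_down(uint64_t adest, uint64_t now_ms)
{
    struct entry *e = find_entry(adest);

    if (e != NULL) {
        e->down_pending = true;
        e->expires_at_ms = now_ms + DOWN_TMR_MS;
    }
}

/* Timer tick: delete entries whose timeout has passed.
 * Returns the number of entries deleted. */
static int expire_entries(uint64_t now_ms)
{
    int deleted = 0;

    for (size_t i = 0; i < MAX_ENTRIES; i++) {
        struct entry *e = &table[i];

        if (e->in_use && e->down_pending && now_ms >= e->expires_at_ms) {
            e->in_use = false;
            deleted++;
        }
    }
    return deleted;
}

/* Part 2: an explicit disconnect deletes the entry right away. */
static bool explicit_disconnect(uint64_t adest)
{
    struct entry *e = find_entry(adest);

    if (e == NULL)
        return false;
    e->in_use = false;
    return true;
}
```

With this shape, a late DOWN for an old connection can no longer remove an entry that a quick finalize/initialize cycle has just re-added, while a client that shuts down cleanly does not have to wait for the timer.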

------------------------------------------------------------------------


_______________________________________________
Opensaf-tickets mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets