[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-19 Thread Hans Feldt
- **status**: review --> fixed
- **assigned_to**: Hans Feldt --> nobody
- **Comment**:

changeset:   5845:e7c037863a2e
branch:      opensaf-4.5.x
parent:      5843:633a2e496589
user:        Hans Feldt hans.fe...@ericsson.com
date:        Fri Sep 19 12:40:26 2014 +0200
summary:     mds: change process_info key to include svc_id [#1050]

changeset:   5846:7f9c430348da
branch:      opensaf-4.5.x
user:        Hans Feldt hans.fe...@ericsson.com
date:        Fri Sep 19 12:40:36 2014 +0200
summary:     mds: delete proc_info for non existing process after tmo [#1050]

changeset:   5847:cee0964e0ed8
branch:      opensaf-4.5.x
user:        Hans Feldt osafde...@gmail.com
date:        Wed Sep 10 11:15:48 2014 +0200
summary:     mds: add mds_auth_server_disconnect() [#1050]

changeset:   5848:5a5614999c4a
branch:      opensaf-4.5.x
user:        Hans Feldt osafde...@gmail.com
date:        Wed Sep 10 11:15:49 2014 +0200
summary:     imma: use mds_auth_server_disconnect [#1050]

changeset:   5849:26210f148ba2
parent:      5844:0631071a7053
user:        Hans Feldt hans.fe...@ericsson.com
date:        Fri Sep 19 12:40:26 2014 +0200
summary:     mds: change process_info key to include svc_id [#1050]

changeset:   5850:1c4f1af50184
user:        Hans Feldt hans.fe...@ericsson.com
date:        Fri Sep 19 12:40:36 2014 +0200
summary:     mds: delete proc_info for non existing process after tmo [#1050]

changeset:   5851:559e82c21e26
user:        Hans Feldt osafde...@gmail.com
date:        Wed Sep 10 11:15:48 2014 +0200
summary:     mds: add mds_auth_server_disconnect() [#1050]

changeset:   5852:7dd084530461
tag:         tip
user:        Hans Feldt osafde...@gmail.com
date:        Wed Sep 10 11:15:49 2014 +0200
summary:     imma: use mds_auth_server_disconnect [#1050]




---

**[tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize**

**Status:** fixed
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Wed Sep 17, 2014 02:03 PM UTC
**Owner:** nobody

With MDS over TIPC, amfnd randomly fails to start, which causes the whole
opensaf start to fail.

osafimmnd logs the infamous immnd_evt_proc_imm_init: ... MDS problem?

The reason is a random timing variation of the TIPC topology DOWN event. This
sometimes causes the DOWN event to wrongly delete a newly added process_info
entry.

The trigger for this problem is that some IMM clients in opensaf, like amfnd,
do not reuse IMM handles but initialize/finalize them in a far from optimal
way. This should also be fixed.

The solution under test consists of two parts:
1) The MDS down event just starts a timer in MDS. When the timeout event
happens, the process_info entry is deleted.

2) A new explicit disconnect() is added to the MDS API. It is used by the IMMA
library when it is about to close down the whole core library.
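
The two parts of the solution can be sketched roughly as follows. This is a toy model in Python, not the actual patch: the class `ProcessInfoTable` and its method names are made up for illustration, and the rule "on timeout, delete only if the process no longer exists" is an assumption based on the changeset summary "delete proc_info for non existing process after tmo".

```python
import heapq


class ProcessInfoTable:
    """Toy model of a process_info table; not the OpenSAF implementation."""

    def __init__(self, down_timeout=1.0):
        self.entries = {}   # adest -> pid (0 until the client is authenticated)
        self.timers = []    # min-heap of (deadline, adest)
        self.down_timeout = down_timeout

    def on_up(self, adest):
        # Service UP: add an entry with pid 0 unless one already exists.
        self.entries.setdefault(adest, 0)

    def on_auth(self, adest, pid):
        # Authentication message: record the sender's pid.
        self.entries[adest] = pid

    def on_down(self, adest, now):
        # Part 1: DOWN does not delete anything, it only arms a timer.
        heapq.heappush(self.timers, (now + self.down_timeout, adest))

    def on_tick(self, now, pid_exists):
        # Timer expiry: drop the entry only if its process is gone.
        while self.timers and self.timers[0][0] <= now:
            _, adest = heapq.heappop(self.timers)
            pid = self.entries.get(adest)
            if pid is not None and not pid_exists(pid):
                del self.entries[adest]

    def disconnect(self, adest):
        # Part 2: an explicit disconnect removes the entry immediately.
        self.entries.pop(adest, None)


ADEST = 0x2020F53B80025
alive = {5335}
table = ProcessInfoTable(down_timeout=1.0)
table.on_up(ADEST)
table.on_auth(ADEST, 5335)
table.on_down(ADEST, now=0.0)              # late DOWN for an old handle
table.on_tick(now=2.0, pid_exists=alive.__contains__)
print(table.entries.get(ADEST))            # entry survives: process is alive
```

In this model a late DOWN can no longer remove the entry of a client that is still running, and a client that really goes away is cleaned up either by the timer or by its own disconnect call.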



---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.

___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


Re: [tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-18 Thread Mathivanan Naickan Palanivelu

Did you not consider using the security key exchanged between the client and
server as the 'key' for storing and looking up entries in MDS?

Mathi.


Re: [tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-18 Thread Hans Feldt
No, I used what is (to me) the standard mechanism for locally connected
sockets that I was aware of. It is used in other similar situations.
/Hans
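
The thread does not name the mechanism. As an illustration only, one common Linux facility for learning who is on the other end of a local socket is `SO_PEERCRED` on a connected `AF_UNIX` socket. Whether MDS uses this or a related facility is an assumption here; the helper `peer_credentials` is made up for the example.

```python
import os
import socket
import struct


def peer_credentials(sock):
    """Return (pid, uid, gid) of the peer of a connected AF_UNIX socket.

    Linux only: reads the kernel-provided struct ucred via SO_PEERCRED.
    """
    fmt = "iII"  # pid_t, uid_t, gid_t as laid out in struct ucred
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


# Both ends of a socketpair belong to this process, so the peer is ourselves.
server_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
with server_end, client_end:
    pid, uid, gid = peer_credentials(server_end)
    print(pid == os.getpid(), uid == os.getuid())
```

The point of such a mechanism is that the pid comes from the kernel, not from anything the client writes into the message.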


[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-17 Thread Hans Feldt
- **Comment**:

Here's a trace snippet from an opensaf start that is hard to explain...

Sep  8 13:47:55.90 osafimmnd [5233:mds_c_api.c:1614] TR svc UP process_info NOTEXIST, svc:26, adest:2020f53b80025
Sep  8 13:47:55.777801 osafimmnd [5233:mds_c_db.c:2352] mds_process_info_add: dest:2020f53b80025, pid:0
Sep  8 13:47:55.777987 osafimmnd [5233:mds_main.c:0151] TR mds: received 77 from 2020f53b80025, pid 5335
Sep  8 13:47:55.778006 osafimmnd [5233:mds_main.c:0167] TR dest 2020f53b80025 already exist
Sep  8 13:47:55.792541 osafimmnd [5233:mds_c_api.c:2675] TR svc 26 DOWN cnt:0, adest:2020f53b80025
Sep  8 13:47:55.792557 osafimmnd [5233:mds_c_db.c:2361] mds_process_info_del: dest:2020f53b80025, pid:5335

Sep  8 13:47:55.792655 osafimmnd [5233:mds_c_api.c:1614] TR svc UP process_info NOTEXIST, svc:26, adest:2020f53b80025
Sep  8 13:47:55.792679 osafimmnd [5233:mds_c_db.c:2352] mds_process_info_add: dest:2020f53b80025, pid:0
Sep  8 13:47:55.792701 osafimmnd [5233:mds_main.c:0151] TR mds: received 77 from 2020f53b80025, pid 5335
Sep  8 13:47:55.792945 osafimmnd [5233:mds_main.c:0167] TR dest 2020f53b80025 already exist
Sep  8 13:47:55.811859 osafimmnd [5233:mds_main.c:0151] TR mds: received 77 from 2020f53b80025, pid 5335
Sep  8 13:47:55.811903 osafimmnd [5233:mds_main.c:0167] TR dest 2020f53b80025 already exist
Sep  8 13:47:55.811994 osafimmnd [5233:mds_c_api.c:2675] TR svc 26 DOWN cnt:0, adest:2020f53b80025
Sep  8 13:47:55.812008 osafimmnd [5233:mds_c_db.c:2361] mds_process_info_del: dest:2020f53b80025, pid:5335
Sep  8 13:47:55.812091 osafimmnd [5233:mds_c_api.c:1614] TR svc UP process_info NOTEXIST, svc:26, adest:2020f53b80025
Sep  8 13:47:55.812104 osafimmnd [5233:mds_c_db.c:2352] mds_process_info_add: dest:2020f53b80025, pid:0

Sep  8 13:47:55.812194 osafimmnd [5233:immnd_evt.c:0726] WA immnd_evt_proc_imm_init: PID 0 (5335) for 2020f53b80025, MDS problem?
Sep  8 13:47:55.812742 osafimmnd [5233:mds_c_api.c:2675] TR svc 26 DOWN cnt:0, adest:2020f53b80025
Sep  8 13:47:55.812760 osafimmnd [5233:mds_c_db.c:2361] mds_process_info_del: dest:2020f53b80025, pid:0

pid:5335 is amfnd
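
One possible reading of the second half of the trace can be replayed with a small Python model. This is an interpretation, not a confirmed analysis: it assumes "received 77 ... pid 5335" is the message that records the client's pid, that "svc UP ... NOTEXIST" adds an entry with pid 0, and that DOWN deletes the entry at once. The function `replay` and the event names are made up for the example.

```python
def replay(events):
    """Replay events for a single adest against a naive process_info slot.

    Returns the pid that each INIT event finds in the slot.
    """
    entry = None                      # None means "no process_info entry"
    seen_by_init = []
    for kind, pid in events:
        if kind == "UP":
            if entry is None:
                entry = 0             # added with pid:0
        elif kind == "AUTH":
            entry = pid               # pid recorded from the received message
        elif kind == "DOWN":
            entry = None              # naive behaviour: delete immediately
        elif kind == "INIT":
            seen_by_init.append(entry)
    return seen_by_init


# Order seen in the trace from 13:47:55.792655 onwards: the second AUTH lands
# on the old entry, the late DOWN deletes it, UP re-adds it with pid 0.
observed = [("UP", None), ("AUTH", 5335), ("AUTH", 5335),
            ("DOWN", None), ("UP", None), ("INIT", None)]

# Order one would expect without the timing variation.
expected = [("UP", None), ("AUTH", 5335), ("DOWN", None),
            ("UP", None), ("AUTH", 5335), ("INIT", None)]

print(replay(observed))   # INIT finds pid 0, as in the "PID 0 (5335)" warning
print(replay(expected))   # INIT finds pid 5335
```

Under these assumptions, the only difference between the failing and the working case is whether the DOWN for the previous connection is processed before or after the pid for the next connection is recorded.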




---

**[tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize**

**Status:** review
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Mon Sep 15, 2014 01:45 PM UTC
**Owner:** Hans Feldt



Re: [tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-17 Thread ramesh betham

Hi Hans,

Thanks for providing the traces. These traces gave more clarity about 
the race condition happening between authentication and TIPC sockets.


I Ack the latest patch with one comment:

   There can be a memleak if the client process exits after the expiry
   of MDS DOWN_TMR and without calling mds_auth_server_disconnect().
   So a simple function that checks for stale process_info structs
   (i.e., ones whose PID no longer exists) and deletes them in
   mds_register_callback() may help.
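
A stale-entry check of the kind suggested above could look roughly like this. It is a Python sketch under assumptions, not MDS code: `pid_exists` and `sweep_stale` are hypothetical helpers, the table is modelled as a plain dict of key to pid, and entries with pid 0 (not yet authenticated) are deliberately left alone. The existence check relies on POSIX `kill` with signal 0.

```python
import os


def pid_exists(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    if pid <= 0:
        raise ValueError("pid must be positive")
    try:
        os.kill(pid, 0)          # signal 0: only checks, nothing is delivered
    except ProcessLookupError:
        return False             # no such process
    except PermissionError:
        return True              # process exists but belongs to another user
    return True


def sweep_stale(process_info, exists=pid_exists):
    """Delete entries whose process is gone; return the removed keys."""
    stale = [key for key, pid in process_info.items()
             if pid > 0 and not exists(pid)]
    for key in stale:
        del process_info[key]
    return stale
```

The `exists` parameter is only there so the sweep can be exercised with a fake predicate instead of real processes.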

This latest patch stabilizes the authentication feature, and I sincerely
appreciate you listening to my review comments.


Best Regards,
Ramesh.




[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-15 Thread Hans Feldt
Please check/test these patches


Attachment: osaf-1050.tgz (5.5 kB; application/x-compressed-tar) 


---

**[tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize**

**Status:** review
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Fri Sep 12, 2014 09:26 AM UTC
**Owner:** Hans Feldt



[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-12 Thread Hans Feldt
changeset:   5766:00a8950e6888
branch:      opensaf-4.5.x
parent:      5764:ed452ef6f6d4
user:        Hans Feldt hans.fe...@ericsson.com
date:        Thu Sep 11 15:42:47 2014 +0200
summary:     imma: fix potential race with is_immnd_up [#1050]

changeset:   5767:e7bad9ce4537
tag:         tip
parent:      5765:d09a52b10727
user:        Hans Feldt hans.fe...@ericsson.com
date:        Thu Sep 11 15:42:47 2014 +0200
summary:     imma: fix potential race with is_immnd_up [#1050]



---

**[tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize**

**Status:** review
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Wed Sep 10, 2014 09:16 AM UTC
**Owner:** Hans Feldt



[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-10 Thread Hans Feldt
- **status**: accepted --> review



---

**[tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize**

**Status:** review
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Tue Sep 09, 2014 07:08 AM UTC
**Owner:** Hans Feldt



[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-09 Thread Hans Feldt



---

**[tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize**

**Status:** accepted
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Tue Sep 09, 2014 07:08 AM UTC
**Owner:** Hans Feldt



