[tickets] [opensaf:tickets] #728 amfd to exit immediately upon receiving local amfnd down during 'opensaf stop'

2014-01-17 Thread Mathi Naickan
--- ** [tickets:#728] amfd to exit immediately upon receiving local amfnd down during 'opensaf stop'** **Status:** accepted **Labels:** AMFD exit AMFND down opensafd stop **Created:** Fri Jan 17, 2014 08:22 AM UTC by Mathi Naickan **Last Updated:** Fri Jan 17, 2014 08:22 AM UTC **Owner:**

[tickets] [opensaf:tickets] #721 IMMD asserted when trying to become active during failover

2014-01-17 Thread Anders Bjornerstedt
Changing component for this ticket to RDE. Possibly it should be FM. RDE at SC2 is changing role to active while SC2 still has contact over MDS with SC1. Jan 15 18:24:01 SLES-64BIT-SLOT2 osafrded[2616]: NO rde_rde_set_role: role set to 1 --- ** [tickets:#721] IMMD asserted when trying to

[tickets] [opensaf:tickets] #724 imm: sync with payload node resulted in controller reboots

2014-01-17 Thread Anders Bjornerstedt
- **Milestone**: future -- 4.4.RC1 --- ** [tickets:#724] imm: sync with payload node resulted in controller reboots** **Status:** accepted **Created:** Thu Jan 16, 2014 11:22 AM UTC by surender khetavath **Last Updated:** Fri Jan 17, 2014 08:42 AM UTC **Owner:** Anders Bjornerstedt

[tickets] [opensaf:tickets] #728 amfd to exit immediately upon receiving local amfnd down during 'opensaf stop'

2014-01-17 Thread Mathi Naickan
Please ignore the patch in https://sourceforge.net/p/opensaf/mailman/opensaf-devel/thread/patchbomb.1389947750%40dhcp-hyd-scp-5fl-10-176-178-129.in.oracle.com/#msg31856150 Instead consider the patch in

[tickets] [opensaf:tickets] #724 imm: sync with payload node resulted in controller reboots

2014-01-17 Thread Neelakanta Reddy
Analysis of the SC-2 syslog shows that the SC-2 IMMND also got a healthcheck timeout at the same time. Jan 16 21:57:12 SLES-SLOT3 osafamfnd[4443]: NO 'safComp=IMMND,safSu=SC-2,safSg=NoRed,safApp=OpenSAF' faulted due to 'healthCheckcallbackTimeout' : Recovery is 'componentRestart' so, IMMD does not

[tickets] [opensaf:tickets] #729 Node did not go for reboot when error reported on npi component, recommended recovery as node switchover

2014-01-17 Thread Sirisha Alla
--- ** [tickets:#729] Node did not go for reboot when error reported on npi component, recommended recovery as node switchover** **Status:** unassigned **Created:** Fri Jan 17, 2014 09:06 AM UTC by Sirisha Alla **Last Updated:** Fri Jan 17, 2014 09:06 AM UTC **Owner:** nobody Changeset :

[tickets] [opensaf:tickets] #724 imm: sync with payload node resulted in controller reboots

2014-01-17 Thread Anders Bjornerstedt
Yes (you mean huge *number* of objects in one CCB) I suppose that is one possible fix. But to me it is strange that the IMMND processing takes such a long time. It is just the RAM processing we are talking about here, not the PBE processing, since it is the IMMND that submerges in processing and

[tickets] [opensaf:tickets] #729 Node did not go for reboot when error reported on npi component, recommended recovery as node switchover

2014-01-17 Thread Sirisha Alla
- Description has changed: Diff: --- old +++ new @@ -22,7 +22,15 @@ PL-3 should go for reboot, but it did not happen. -Note: For 2N model with above configuration, node went for reboot successfully. +The same scenario with 2N model observed that amf is crashed + +-

[tickets] [opensaf:tickets] #724 imm: sync with payload node resulted in controller reboots

2014-01-17 Thread surender khetavath
The system was not at all overloaded. The memory available is ~8GB cat /proc/meminfo MemTotal: 7945404 kB MemFree: 7614556 kB Buffers: 11980 kB Cached: 127952 kB SwapCached: 0 kB Also, it is a physical machine, not a VM. --- ** [tickets:#724] imm: sync

[tickets] [opensaf:tickets] #724 imm: sync with payload node resulted in controller reboots

2014-01-17 Thread Anders Bjornerstedt
This ticket also raises a general issue about the health-check timer approach of OpenSAF. The out of the box OpenSAF has quite a large number of such timeouts in its AMF configuration data, expressed as absolute and real-time values. But OpenSAF in general will execute on many different platforms

[tickets] [opensaf:tickets] #721 IMMD asserted when trying to become active during failover

2014-01-17 Thread Mathi Naickan
Anders, Some background information. FM used to start failover processing upon receiving NODE_DOWN event. From 4.4 onwards FM subscribes to AVND down events to start failover processing. This is based on the fact that AVND is the last process to exit (barring AMFD) i.e. after all the

[tickets] [opensaf:tickets] #721 IMMD asserted when trying to become active during failover

2014-01-17 Thread Mathi Naickan
- **Component**: rde -- osaf --- ** [tickets:#721] IMMD asserted when trying to become active during failover** **Status:** unassigned **Created:** Thu Jan 16, 2014 07:32 AM UTC by Sirisha Alla **Last Updated:** Fri Jan 17, 2014 08:37 AM UTC **Owner:** nobody The issue is seen on changeset

[tickets] [opensaf:tickets] Re: #721 IMMD asserted when trying to become active during failover

2014-01-17 Thread Anders Bjornerstedt
Hi Mathi, In this test case it is the amfnd (==AVND I assume) that is killed. So it goes down first. That explains why this particular test is so provocative then. The problem with removing this assert is that it is like opening a can of worms. It means that I am starting to try to support a

Re: [tickets] [opensaf:tickets] Re: #721 IMMD asserted when trying to become active during failover

2014-01-17 Thread Anders Björnerstedt
There even seems to be a risk that the old active IMMD could be executing overlapped with the new active IMMD executing. Nooo. :-) /AndersBj From: Anders Bjornerstedt [mailto:ander...@users.sf.net] Sent: 17 January 2014 11:42 To:

[tickets] [opensaf:tickets] Re: #721 IMMD asserted when trying to become active during failover

2014-01-17 Thread Anders Bjornerstedt
There even seems to be a risk that the old active IMMD could be executing overlapped with the new active IMMD executing. Nooo. :-) /AndersBj From: Anders Bjornerstedt [mailto:ander...@users.sf.net] Sent: 17 January 2014 11:42 To:

[tickets] [opensaf:tickets] #721 IMMD asserted when trying to become active during failover

2014-01-17 Thread Mathi Naickan
:-) Well yeah, but it at least gives ideas for alternative approaches! For example, one vague idea could be that FM checks for the existence of any other core OpenSAF service when the AMFND down event arrives, and can thus differentiate an AMFND kill from an opensafd stop; this way it will postpone acting

[tickets] [opensaf:tickets] #690 Opensaf start failed when RDE could not RESPAWN.

2014-01-17 Thread Anders Widell
Here you are trying to set the scheduling policy to 3 (SCHED_BATCH), which is not a valid scheduling policy for threads. We can add a check that falls back to the default policy whenever an invalid policy is encountered. --- ** [tickets:#690] Opensaf start failed when RDE could not
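The fallback check suggested in the #690 comment above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the actual OpenSAF change: the set of policies treated as valid for threads, the choice of SCHED_OTHER as the default, and the name `sanitize_policy` are all hypothetical. The numeric values are the Linux `<sched.h>` policy numbers.

```python
# Linux scheduling policy numbers, as defined in <sched.h>.
SCHED_OTHER = 0
SCHED_FIFO = 1
SCHED_RR = 2
SCHED_BATCH = 3

# Assumption: only these policies are accepted for threads. The ticket
# comment only states that 3 (SCHED_BATCH) is rejected.
VALID_THREAD_POLICIES = {SCHED_OTHER, SCHED_FIFO, SCHED_RR}

# Assumption: SCHED_OTHER is the "default policy" to fall back to.
DEFAULT_POLICY = SCHED_OTHER


def sanitize_policy(requested):
    """Return (policy, fell_back).

    The requested policy is kept if it is in the valid set; otherwise the
    default policy is returned and fell_back is True so the caller can log it.
    """
    if requested in VALID_THREAD_POLICIES:
        return requested, False
    return DEFAULT_POLICY, True


if __name__ == "__main__":
    for requested in (SCHED_RR, SCHED_BATCH, 42):
        policy, fell_back = sanitize_policy(requested)
        note = " (invalid, fell back to default)" if fell_back else ""
        print(f"requested={requested} -> using {policy}{note}")
```

With a check of this kind, a misconfigured policy value would presumably lead to a logged fallback rather than a failed start.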

[tickets] [opensaf:tickets] #636 IMM: Missing immomtest cases for NO_DANGLING

2014-01-17 Thread Zoran Milinkovic
opensaf-4.4.x: changeset: 4815:c8cb0beaa27e branch: opensaf-4.4.x parent: 4807:bb5a37c82405 user: Zoran Milinkovic zoran.milinko...@ericsson.com date: Thu Jan 16 12:48:29 2014 +0100 summary: IMMTEST: add new test cases for NO_DANGLING flag [#636] -

[tickets] [opensaf:tickets] #636 IMM: Missing immomtest cases for NO_DANGLING

2014-01-17 Thread Zoran Milinkovic
- **status**: review -- fixed --- ** [tickets:#636] IMM: Missing immomtest cases for NO_DANGLING ** **Status:** fixed **Created:** Fri Nov 22, 2013 09:50 AM UTC by Anders Bjornerstedt **Last Updated:** Thu Jan 16, 2014 11:58 AM UTC **Owner:** Zoran Milinkovic This is related to ticket (#49)

[tickets] [opensaf:tickets] #726 IMM: wrong error codes in README.NO_DANGLING

2014-01-17 Thread Zoran Milinkovic
- **status**: review -- fixed --- ** [tickets:#726] IMM: wrong error codes in README.NO_DANGLING** **Status:** fixed **Created:** Thu Jan 16, 2014 12:14 PM UTC by Zoran Milinkovic **Last Updated:** Thu Jan 16, 2014 12:27 PM UTC **Owner:** Zoran Milinkovic In two places in README.NO_DANGLING,

[tickets] [opensaf:tickets] #724 imm: sync with payload node resulted in controller reboots

2014-01-17 Thread Anders Bjornerstedt
Has this test been tried without trace on for osafimmnd? Having trace turned on will be a major factor in slowing down the removal of the 200k object CCB from RAM in osafimmnd. If you have not tested without trace and it is not too time consuming to rerun, then please test without trace. Trace

[tickets] [opensaf:tickets] #730 AMF: Allow trace-on in healtcheck-callback-reply to increase timeout

2014-01-17 Thread Anders Bjornerstedt
--- ** [tickets:#730] AMF: Allow trace-on in healtcheck-callback-reply to increase timeout** **Status:** unassigned **Created:** Fri Jan 17, 2014 02:51 PM UTC by Anders Bjornerstedt **Last Updated:** Fri Jan 17, 2014 02:51 PM UTC **Owner:** nobody When trace is turned on for an OpenSAF

Re: [tickets] [opensaf:tickets] Re: #724 imm: sync with payload node resulted in controller reboots

2014-01-17 Thread Hans Feldt
I think we need to improve troubleshooting. We can send abort in cleanup scripts. This has been a pending item for years. Why not just do it? AMF should probably log more and better, etc. Sent from my Sony Xperia™ smartphone. Anders Bjornerstedt wrote: I was mainly thinking about

[tickets] [opensaf:tickets] Re: #724 imm: sync with payload node resulted in controller reboots

2014-01-17 Thread Hans Feldt
I think we need to improve troubleshooting. We can send abort in cleanup scripts. This has been a pending item for years. Why not just do it? AMF should probably log more and better, etc. Sent from my Sony Xperia™ smartphone. Anders Bjornerstedt wrote: I was mainly thinking about