Thanks to Scott Petersen, I now have the hisv service up and running on my two OpenSAF 1.0-4 controllers.
Can anyone recommend the best way to test the hisv service, to make sure it is communicating properly with my OpenHPI daemon process, which is also running on my controller nodes?

I've noticed that there is an hisv_demo.c program under services/hisv/hcd, but it does not appear to get built via the provided Makefile.

Any suggestions on how to test the hisv service would be appreciated.

Regards,
Michael Bishop
Open Source & Linux Organization (OSLO)
Hewlett-Packard Company
3404 E. Harmony Rd. Bldg. 5L, Post C8, Mailstop 42
Fort Collins, CO 80528-9599
Phone: 970-898-4393
E-Mail: [EMAIL PROTECTED]

> -----Original Message-----
> From: Petersen Scott-P27052 [mailto:[EMAIL PROTECTED]
> Sent: Thursday, November 29, 2007 4:06 PM
> To: Bishop, Michael (OSLO R&D); [email protected]
> Subject: RE: [Users] Problems starting up hisv (HPI) service
>
> Hi Michael,
>
> I looked at your BOM file and it appears that not all of the sections
> were uncommented for the HISV component. I uncommented those sections
> and the updated file is attached.
>
> Hope this helps
>
> Scott G. Petersen
> System Validation
> Motorola, Inc.
> Embedded Communications Computing
> 2900 S Diablo Way
> Tempe, AZ 85282
> Phone: 602-438-3471
> Cell: 480-600-6964
> Text: [EMAIL PROTECTED]
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Bishop, Michael (OSLO R&D)
> Sent: Thursday, November 29, 2007 3:51 PM
> To: [email protected]
> Subject: [Users] Problems starting up hisv (HPI) service
>
> Greetings.
>
> I have OpenSAF 1.0-4 controllers, and I have an instance of OpenHPI
> running (HPI-B.01.01 OpenHPI version 2.8.1).
>
> I built the controllers with hw_mgmt=2. I do have an ncs_hisv
> executable in /opt/opensaf/controller/bin.
>
> I cannot seem to get the hisv service to start properly. Both
> controllers get stuck in SCAP. If I Ctrl-C out of SCAP and look at the
> logs, there is no indication that hisv is attempting to start.
>
> I'm assuming that my problem most likely lies in my NCSSystemBOM.xml,
> which I've attached. I've uncommented the config lines pertaining to
> HISV and hisv, as documented in earlier e-mails.
>
> Thanks for any help or suggestions.
>
> Regards,
> Michael Bishop
> Open Source & Linux Organization (OSLO)
> Hewlett-Packard Company
> 3404 E. Harmony Rd. Bldg. 5L, Post C8, Mailstop 42
> Fort Collins, CO 80528-9599
> Phone: 970-898-4393
> E-Mail: [EMAIL PROTECTED]
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] Behalf Of Hans Feldt
> > Sent: Thursday, November 29, 2007 3:03 AM
> > To: Gudipalli S.-G19449
> > Cc: [email protected]
> > Subject: Re: [Users] Controller HA mechanisms
> >
> > What do you mean by "configure NID config"?
> >
> > Change nid so that it stays alive and supervises its children?
> >
> > Thanks,
> > Hans
> >
> > Gudipalli S.-G19449 wrote:
> > > Hi Hans,
> > >
> > > In a system, the controller nodes that will have RDE/SCAP will
> > > power on by themselves.
> > >
> > > Case both blades in the box, the power to the box is applied:
> > > -------------------------------------------------------------
> > >
> > > Sub case1:
> > > ----------
> > > RDE on node 1 becomes active.
> > > SCAP on node 1 starts but fails to complete Init successfully.
> > >
> > > We expect the platform vendor porting OpenSAF to configure NID
> > > config to reboot the node on failure, or have his platform
> > > mechanisms do that for him.
> > >
> > > Sub case2:
> > > ----------
> > > RDE on node 1 becomes active.
> > > SCAP on node 1 starts and completes Init successfully.
> > > Immediately afterwards it crashes.
> > >
> > > Since the two nodes, node 1 and node 2, were in the box when the
> > > power was applied, we expect that, given small variations in the
> > > boot times, node 2 will be at SCAP initialization before node 1 is
> > > successfully initialized. Since the other RDE/SCAP is there, this
> > > situation is also solved.
> > >
> > > Case one blade in the box, the power to the box is applied:
> > > ----------------------------------------------------------------
> > >
> > > Sub case1:
> > > ----------
> > > RDE on node 1 becomes active.
> > > SCAP on node 1 starts but fails to complete Init successfully.
> > >
> > > We expect the platform vendor porting OpenSAF to configure NID
> > > config to reboot the node on failure, or have his platform
> > > mechanisms do that for him.
> > >
> > > Sub case2:
> > > ----------
> > > RDE on node 1 becomes active.
> > > SCAP on node 1 starts and completes Init successfully.
> > > Immediately afterwards it crashes.
> > >
> > > This is a double fault case; a manual repair of restarting this
> > > single node is required. If the platform is normally run like
> > > this, then the platform vendor can have his fault manager track
> > > SCAP and on its death take the necessary recover/repair actions.
> > >
> > > Regards
> > > Sugadeesh
> > >
> > >> -----Original Message-----
> > >> From: [EMAIL PROTECTED]
> > >> [mailto:[EMAIL PROTECTED] On Behalf Of Hans Feldt
> > >> Sent: Thursday, November 29, 2007 2:43 AM
> > >> To: Saha Sayandeb-G19428
> > >> Cc: [email protected]
> > >> Subject: Re: [Users] Controller HA mechanisms
> > >>
> > >>> -----Original Message-----
> > >>> From: Saha Sayandeb-G19428 [mailto:[EMAIL PROTECTED]
> > >>> Sent: 28 November 2007 18:23
> > >>> To: Hans Feldt
> > >>> Cc: [email protected]
> > >>> Subject: RE: [Users] Controller HA mechanisms
> > >>>
> > >>> Hans,
> > >>>
> > >>> Comments below ...
> > >>>
> > >>>> How does OpenSAF handle the following scenario:
> > >>>>
> > >>>> - Controller 1 (C1) power on
> > >>>> - C1 RDE starts and decides to be active since it is alone in
> > >>>>   the cluster
> > >>>> - C1 PSR or AMF dies due to some reason
> > >>>> - Controller 2 (C2) power on
> > >>>> - C2 RDE starts and gets the role standby from RDE on C1
> > >>>> - C2 waits forever to get synced from C1
> > >>>>
> > >>>> Some issues:
> > >>>> C1 RDE claims to be active although it is not
> > >>>> C1 does not reboot
> > >>>> C2 does not reboot when it loses contact with the active
> > >>>> controller and is not in sync.
> > >>>> C2 cannot become active if we reboot C1
> > >>>>
> > >>>> Comments?
> > >>>
> > >>> [SS] I simulated this condition quite easily by simply killing
> > >>> the ncs_scap process in the one and only active controller and
> > >>> then running the get_ha_state command. As you say, the RDE in
> > >>> this controller still keeps thinking that it is active, which
> > >>> prevents the second controller from obtaining the active state.
> > >>> So this is a hole, as the RDE has no clue that the AvD+AvM has
> > >>> crashed. I guess we could add a role heart-beat from the AvD+AvM
> > >>> to the RDE to ensure that the RDE is always in sync with what's
> > >>> going on and can relinquish the active state so that the other
> > >>> controller can become active under such a circumstance.
> > >>> But this whole scenario of having only one controller which
> > >>> crashes and then the second one that tries to come up is
> > >>> probably not so common, or do you think it will be because of
> > >>> the way OpenSAF waits 3 minutes before rebooting payload blades
> > >>> when AvD goes down?
> > >>
> > >> No, I just stumbled on this since we're doing a lot of power
> > >> on/off of controllers and fail-overs at the moment.
> > >>
> > >> As a solution, what if nid stays alive and supervises its
> > >> children? If rde or scap dies, nid reboots the system.
> > >>
> > >> Cheers,
> > >> Hans
> > >>
> > >>> Sayan
> > >>>
> > >>>> Regards,
> > >>>> Hans

_______________________________________________
Users mailing list
[email protected]
http://list.opensaf.org/maillist/listinfo/users
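
Hans's suggestion in the quoted thread above (nid stays alive, supervises its children, and reboots the system if rde or scap dies) can be illustrated with a small POSIX sketch. This is not taken from the OpenSAF sources: the `struct child` type, `supervise()`, and the stand-in functions `stays_up()` and `dies_at_once()` are all hypothetical names, and the sketch only reports which child died instead of rebooting anything.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical description of one supervised child (not an OpenSAF type). */
struct child {
    const char *name;   /* label only, e.g. "rde" or "scap" */
    int (*run)(void);   /* stand-in for the real daemon; return value is the exit code */
    pid_t pid;          /* filled in by supervise() */
};

/* Terminate and reap children[0..count-1], skipping index `skip` (-1 skips none). */
static void stop_children(struct child *children, int count, int skip)
{
    for (int i = 0; i < count; i++) {
        if (i == skip)
            continue;
        kill(children[i].pid, SIGTERM);
        waitpid(children[i].pid, NULL, 0);
    }
}

/* Fork every child, then block until the first one exits.
 * Returns the index of the child that died first, or -1 on error.
 * A real supervisor would trigger a node reboot at that point. */
int supervise(struct child *children, int n)
{
    assert(children != NULL && n > 0);

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            stop_children(children, i, -1);   /* undo the partial start */
            return -1;
        }
        if (pid == 0)
            _exit(children[i].run());         /* child process */
        children[i].pid = pid;
    }

    pid_t dead;
    do {
        dead = wait(NULL);                    /* first child to exit */
    } while (dead < 0 && errno == EINTR);

    int idx = -1;
    for (int i = 0; i < n; i++)
        if (children[i].pid == dead)
            idx = i;

    if (idx >= 0)
        fprintf(stderr, "%s exited; a real supervisor would reboot the node here\n",
                children[idx].name);

    stop_children(children, n, idx);          /* clean up the survivors */
    return idx;
}

/* Stand-ins for a healthy daemon and one that fails immediately. */
int stays_up(void) { for (;;) pause(); return 0; }
int dies_at_once(void) { return 1; }
```

Calling `supervise()` on an array holding a `stays_up` entry named "rde" and a `dies_at_once` entry named "scap" returns the index of the "scap" entry. The point of the design is that the parent never exits while its children run, so a dead rde or scap is always noticed by some process that can act on it.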
